llvm-project

Commit Graph

Author	SHA1	Message	Date
Saleem Abdulrasool	432b88e5f4	CodeGen: support SwiftError SwiftCC on Windows x64 Add support for passing SwiftError through a register on the Windows x64 calling convention. This allows the use of swifterror attributes on parameters which is used by the swift front end for the `Error` parameter. This partially enables building the swift standard library for Windows x86_64. llvm-svn: 313791	2017-09-20 18:40:59 +00:00
Simon Pilgrim	d202ad15c1	[X86][SSE] Add PR22415 test case llvm-svn: 313755	2017-09-20 13:49:52 +00:00
Florian Hahn	ceb4494786	Recommit [MachineCombiner] Update instruction depths incrementally for large BBs. This version of the patch fixes an off-by-one error causing PR34596. We do not need to use std::next(BlockIter) when calling updateDepths, as BlockIter already points to the next element. Original commit message: > For large basic blocks with lots of combinable instructions, the > MachineTraceMetrics computations in MachineCombiner can dominate the compile > time, as computing the trace information is quadratic in the number of > instructions in a BB and its relevant successors/predecessors. > In most cases, knowing the instruction depth should be enough to make > combination decisions. As we already iterate over all instructions in a basic > block, the instruction depth can be computed incrementally. This reduces the > cost of machine-combine drastically in cases where lots of instructions > are combined. The major drawback is that AFAIK, computing the critical path > length cannot be done incrementally. Therefore we only compute > instruction depths incrementally, for basic blocks with more > instructions than inc_threshold. The -machine-combiner-inc-threshold > option can be used to set the threshold and allows for easier > experimenting and checking if using incremental updates for all basic > blocks has any impact on the performance. > > Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn > > Reviewed By: fhahn > > Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits > > Differential Revision: https://reviews.llvm.org/D36619 llvm-svn: 313751	2017-09-20 11:54:37 +00:00
Mikael Holmen	06064d1bac	[IfConversion] Add testcases [NFC] These tests should have been included in r310697 / D34099 but apparently I missed them. llvm-svn: 313737	2017-09-20 08:23:29 +00:00
Matt Arsenault	b81495dccb	AMDGPU: Match load d16 hi instructions Also starts selecting global loads for constant address in some cases. Some end up selecting to mubuf still, which requires investigation. We still get sub-optimal regalloc and extra waitcnts inserted due to not really tracking the liveness of the separate register halves. llvm-svn: 313716	2017-09-20 05:01:53 +00:00
Stanislav Mekhanoshin	5670e6d482	[AMDGPU] Port of HSAIL inliner Differential Revision: https://reviews.llvm.org/D36849 llvm-svn: 313714	2017-09-20 04:25:58 +00:00
Matt Arsenault	fcc213fab7	AMDGPU: Match store d16_hi instructions llvm-svn: 313712	2017-09-20 03:20:09 +00:00
Quentin Colombet	d652aeb144	[MIRPrinter] Print empty successor lists when they cannot be guessed This re-applies commit r313685, this time with the proper updates to the test cases. Original commit message: Unreachable blocks in the machine instr representation are these weird empty blocks with no successors. The MIR printer used to not print empty lists of successors. However, the MIR parser now treats non-printed list of successors as "please guess it for me". As a result, the parser tries to guess the list of successors and given the block is empty, just assumes it falls through the next block (if any). For instance, the following test case used to fail the verifier. The MIR printer would print entry / \ true (def) false (no list of successors) \| split.true (use) The MIR parser would understand this: entry / \ true (def) false \| / <-- invalid edge split.true (use) Because of the invalid edge, we get the "def does not dominate all uses" error. The fix consists in printing empty successor lists, so that the parser knows what to do for unreachable blocks. rdar://problem/34022159 llvm-svn: 313696	2017-09-19 23:34:12 +00:00
Quentin Colombet	6888dbcda7	Revert "[MIRPrinter] Print empty successor lists when they cannot be guessed" This reverts commit r313685. I thought I had run ninja check, but apparently I didn't... Need to update a bunch of mir tests. llvm-svn: 313686	2017-09-19 22:03:50 +00:00
Quentin Colombet	7fdaa5e641	[MIRPrinter] Print empty successor lists when they cannot be guessed Unreachable blocks in the machine instr representation are these weird empty blocks with no successors. The MIR printer used to not print empty lists of successors. However, the MIR parser now treats non-printed list of successors as "please guess it for me". As a result, the parser tries to guess the list of successors and given the block is empty, just assumes it falls through the next block (if any). For instance, the following test case used to fail the verifier. The MIR printer would print entry / \ true (def) false (no list of successors) \| split.true (use) The MIR parser would understand this: entry / \ true (def) false \| / <-- invalid edge split.true (use) Because of the invalid edge, we get the "def does not dominate all uses" error. The fix consists in printing empty successor lists, so that the parser knows what to do for unreachable blocks. rdar://problem/34022159 llvm-svn: 313685	2017-09-19 21:55:51 +00:00
Stanislav Mekhanoshin	d4ae470d2e	[AMDGPU] Prevent post-RA scheduler from breaking memory clauses The pre-RA scheduler does load/store clustering, but post-RA scheduler undoes it. Add mutation to prevent it. Differential Revision: https://reviews.llvm.org/D38014 llvm-svn: 313670	2017-09-19 20:54:38 +00:00
Ulrich Weigand	59a01a958a	[SystemZ] Fix truncstore + bswap codegen bug SystemZTargetLowering::combineSTORE contains code to transform a combination of STORE + BSWAP into a STRV type instruction. This transformation is correct for regular stores, but not for truncating stores. The routine neglected to check for that case. Fixes a miscompilation of llvm-objcopy with clang, which caused test suite failures in the SystemZ multistage build bot. llvm-svn: 313669	2017-09-19 20:50:05 +00:00
Tony Jiang	2d9c5f3b8b	[PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD. Two blocks prior to the join each perform an li and the join block has an add using the initialized register. Optimize each predecessor block to instead use addi and delete the li's and add. Differential Revision: https://reviews.llvm.org/D36734 llvm-svn: 313639	2017-09-19 16:14:37 +00:00
Evandro Menezes	0a98abc67c	[AArch64] Extend tests of loads and stores of register pairs Include instances of FP register pairs. llvm-svn: 313638	2017-09-19 15:46:35 +00:00
Daniel Sanders	83e23d1398	[globalisel] Add a G_BSWAP instruction and support bswap using it. llvm-svn: 313633	2017-09-19 14:25:15 +00:00
Simon Pilgrim	d5e2878252	[X86][SSE] Add 'redundant pand' test case from PR34620 llvm-svn: 313632	2017-09-19 14:02:16 +00:00
Sanjay Patel	bd7958d7ca	[x86] regenerate checks; NFC llvm-svn: 313631	2017-09-19 13:43:09 +00:00
Daniel Sanders	000327742f	[globalisel] Add support for intrinsic_void llvm-svn: 313629	2017-09-19 13:23:01 +00:00
Daniel Sanders	28887fe548	[globalisel] Add support for intrinsic_w_chain. This maps directly to G_INTRINSIC_W_SIDE_EFFECTS. llvm-svn: 313627	2017-09-19 12:56:36 +00:00
Jina Nahias	ccfb8d4fe8	[x86] Lowering Mask Set1 intrinsics to LLVM IR This patch, together with a matching clang patch (https://reviews.llvm.org/D37668), implements the lowering of X86 mask set1 intrinsics to IR. Differential Revision: https://reviews.llvm.org/D37669 llvm-svn: 313625	2017-09-19 11:03:06 +00:00
Roger Ferrer Ibanez	8d0180c955	[ARM] Use ADDCARRY / SUBCARRY This is a preparatory step for D34515. This change: - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32 - lowering is done by first converting the boolean value into the carry flag using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two operations does the actual addition. - for subtraction, given that ISD::SUBCARRY second result is actually a borrow, we need to invert the value of the second operand and result before and after using ARMISD::SUBE. We need to invert the carry result of ARMISD::SUBE to preserve the semantics. - given that the generic combiner may lower ISD::ADDCARRY and ISD::SUBCARRY into ISD::UADDO and ISD::USUBO we need to update their lowering as well otherwise i64 operations now would require branches. This implies updating the corresponding test for unsigned. - add new combiner to remove the redundant conversions from/to carry flags to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C - fixes PR34045 - fixes PR34564 Differential Revision: https://reviews.llvm.org/D35192 llvm-svn: 313618	2017-09-19 09:05:39 +00:00
Andrei Elovikov	142516b456	Test commit. llvm-svn: 313617	2017-09-19 07:56:20 +00:00
Matt Arsenault	e745d9963e	AMDGPU: Run internalize symbols at -O0 The relocations used for externally visible functions aren't supported, so the direct call emitted ends up hitting a linker error. llvm-svn: 313616	2017-09-19 07:40:11 +00:00
Gadi Haber	6f8fbf4b86	[X86][Skylake] Adding the scheduling information for the SkylakeClient target This patch adds the instruction scheduling information for the SkylakeClient (SKL) architecture target by adding the file X86SchedSkylakeClient.td located under the X86 Target. We used the scheduling information retrieved from the Skylake architects in order to create the file. The scheduling information includes latency, number of micro-Ops and used ports by each SKL instruction. The patch continues the scheduling replacement and insertion effort started with the SNB target in r307529 and r310792 and for HSW in r311879. Please expect some performance fluctuations due to code alignment effects. Reviewers: craig.topper, zvi, chandlerc, igorb, aymanmus, RKSimon, delena Differential Revision: https://reviews.llvm.org/D37294 llvm-svn: 313613	2017-09-19 06:19:27 +00:00
Craig Topper	a80949feb5	[X86] Add VPERMPD/VPERMQ and VPERMPS/VPERMD to the execution domain fixing table. llvm-svn: 313610	2017-09-19 04:39:55 +00:00
Yonghong Song	9ef85f0677	bpf: add inline-asm support Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 313593	2017-09-18 23:29:36 +00:00
Sanjay Patel	f31b1a00ea	[DAGCombiner] fold assertzexts separated by trunc If we have an AssertZext of a truncated value that has already been AssertZext'ed, we can assert on the wider source op to improve the zext-y knowledge: assert (trunc (assert X, i8) to iN), i1 --> trunc (assert X, i1) to iN This moves a fold from being Mips-specific to general combining, and x86 shows improvements. Differential Revision: https://reviews.llvm.org/D37017 llvm-svn: 313577	2017-09-18 22:05:35 +00:00
Konstantin Zhuravlyov	ca8946a376	AMDGPU: Start selecting s_xnor_{b32, b64} Differential Revision: https://reviews.llvm.org/D37981 llvm-svn: 313565	2017-09-18 21:22:45 +00:00
Sanjay Patel	7765c93be2	[DAG, x86] allow store merging before and after legalization (PR34217) rL310710 allowed store merging to occur after legalization to catch stores that are created late, but this exposes a logic hole seen in PR34217: https://bugs.llvm.org/show_bug.cgi?id=34217 We will miss merging stores if the target lowers vector extracts into target-specific operations. This patch allows store merging to occur both before and after legalization if the target chooses to get maximum merging. I don't think the potential regressions in the other tests are relevant. The tests are for correctness of weird IR constructs rather than perf tests, and I think those are still correct. Differential Revision: https://reviews.llvm.org/D37987 llvm-svn: 313564	2017-09-18 20:54:26 +00:00
Craig Topper	39cdb84560	[X86] Make sure we still emit zext for GR32 to GR64 when the source of the zext is AssertZext The AssertZext we might see in this case is only giving information about the lower 32 bits. It isn't providing information about the upper 32 bits. So we should emit a zext. This fixes PR28540. Differential Revision: https://reviews.llvm.org/D37729 llvm-svn: 313563	2017-09-18 20:49:13 +00:00
Sanjay Patel	74d12b5697	[x86] add tests for PR34217; NFC llvm-svn: 313548	2017-09-18 18:07:50 +00:00
Simon Pilgrim	4aa28b9730	[X86][AVX] Improve (i8 bitcast (v8i1 x)) handling for 256-bit vector compare results. As commented on D37849, AVX1 targets were missing a chance to use vmovmskps for v8f32/v8i32 results for bool vector bitcasts llvm-svn: 313547	2017-09-18 17:58:31 +00:00
Sanjay Patel	078d5d978c	[x86] regenerate checks; NFC llvm-svn: 313545	2017-09-18 17:33:47 +00:00
Manoj Gupta	7476f629ed	[LoopVectorizer] Add more testcases for PR33804. Summary: Add test cases when float <-> pointer types conversion is triggered in presence of load instructions. Reviewers: Ayal, srhines, mkuper, rengolin Reviewed By: rengolin Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D37967 llvm-svn: 313544	2017-09-18 17:28:15 +00:00
Simon Pilgrim	0b21ef1fa3	[SelectionDAG] Add BITCAST handling to ComputeNumSignBits for splatted sign bits. For cases where we are BITCASTing to vectors of smaller elements, then if the entire source was a splatted sign (src's NumSignBits == SrcBitWidth) we can say that the dst's NumSignBit == DstBitWidth, as we're just splitting those sign bits across multiple elements. We could generalize this but at the moment the only use case I have is to peek through bitcasts to vector comparison results. Differential Revision: https://reviews.llvm.org/D37849 llvm-svn: 313543	2017-09-18 16:45:05 +00:00
Craig Topper	77d7f331dd	[X86] Fix two more places to prefer VPERMQ/PD over VPERM2X128 when AVX2 is enabled The shuffle combining and lowerVectorShuffleAsLanePermuteAndBlend were both still trying to use VPERM2X128 for unary shuffles when AVX2 is enabled. VPERM2X128 takes two inputs meaning when we use it for a unary shuffle one of those inputs is left undefined creating a false dependency on whatever register gets allocated there. If we have VPERMQ/PD we should prefer those since they only have a single input. Differential Revision: https://reviews.llvm.org/D37947 llvm-svn: 313542	2017-09-18 16:39:49 +00:00
Simon Pilgrim	00161c9961	[X86][SSE] Improve support for vselect(Cond, 0, X) -> ANDN(Cond, X) As discussed on PR28925 and D37849. Differential Revision: https://reviews.llvm.org/D37975 llvm-svn: 313532	2017-09-18 14:23:23 +00:00
Simon Pilgrim	360629d170	[X86][SSE] Add vselect with zero tests (PR28925) llvm-svn: 313529	2017-09-18 13:32:33 +00:00
Nikolai Bozhenov	84af99b3b1	[X86FixupBWInsts] More precise register liveness if no <imp-use> on MOVs. Summary: Subregister liveness tracking is not implemented for X86 backend, so sometimes the whole super register is said to be live, when only a subregister is really live. That might happen if the def and the use are located in different MBBs, see added fixup-bw-isnt.mir test. However, using knowledge of the specific instructions handled by the bw-fixup-pass we can get more precise liveness information which this change does. Reviewers: MatzeB, DavidKreitzer, ab, andrew.w.kaylor, craig.topper Reviewed By: craig.topper Subscribers: n.bozhenov, myatsina, llvm-commits, hiraditya Patch by Andrei Elovikov <andrei.elovikov@intel.com> Differential Revision: https://reviews.llvm.org/D37559 llvm-svn: 313524	2017-09-18 10:17:59 +00:00
Mohammed Agabaria	77cb080c2d	[X86][Codegen] adding masked gathers tests for avx2 related to patch: https://reviews.llvm.org/D35772 adding llvm gathers test before gathers codegen support. Differential Revision: https://reviews.llvm.org/D37800 llvm-svn: 313516	2017-09-18 06:49:54 +00:00
Craig Topper	a6054328e8	[X86] Teach the execution domain fixing tables to use movlhps in place of unpcklpd for the packed single domain. MOVLHPS has a smaller encoding than UNPCKLPD in the legacy encodings. With VEX and EVEX encodings it doesn't matter. llvm-svn: 313509	2017-09-18 04:40:58 +00:00
Craig Topper	87f7381edf	[X86] Teach execution domain fixing to convert between FP and int unpack instructions. llvm-svn: 313508	2017-09-18 03:29:54 +00:00
Craig Topper	d4341920d5	[X86] Teach execution domain fixing to convert between VPERMILPS and VPSHUFD. llvm-svn: 313507	2017-09-18 03:29:47 +00:00
Craig Topper	ee6646d7de	[X86] Teach shuffle lowering to use MOVLHPS/MOVHLPS for lowering v4f32 unary shuffles with SSE1 only. llvm-svn: 313504	2017-09-17 22:36:41 +00:00
Craig Topper	6c221690a3	[X86] Add a couple more unary shuffles to the sse1 shuffle test. These can be implemented with movlhps and movhlps. llvm-svn: 313503	2017-09-17 22:36:39 +00:00
Jatin Bhateja	356e3e2c1d	Adding test cases for PR34629 & PR34634. Differential Revision: https://reviews.llvm.org/D37962 llvm-svn: 313490	2017-09-17 18:16:26 +00:00
Igor Breger	f1d388a5c5	[GlobalISel][X86] Legalize i1 G_ADD/G_SUB/G_MUL/G_XOR/G_OR/G_AND instructions. llvm-svn: 313483	2017-09-17 11:34:17 +00:00
Igor Breger	0f382ccb68	[GlobalISel][X86] Use correct physical register in mir tests. NFC. llvm-svn: 313479	2017-09-17 08:30:42 +00:00
Igor Breger	21200ed7af	[GlobalISel][X86] G_FCONSTANT support. Summary: G_FCONSTANT support, port the implementation from X86FastIsel. Reviewers: zvi, delena, guyblank Reviewed By: delena Subscribers: rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D37734 llvm-svn: 313478	2017-09-17 08:08:13 +00:00
Sanjay Patel	65d6780703	[x86] enable storeOfVectorConstantIsCheap() target hook This allows vector-sized store merging of constants in DAGCombiner using the existing code in MergeConsecutiveStores(). All of the twisted logic that decides exactly what vector operations are legal and fast for each particular CPU is handled separately in there using the appropriate hooks. For the motivating tests in merge-store-constants.ll, we already produce the same vector code in IR via the SLP vectorizer. So this is just providing a backend backstop for code that doesn't go through that pass (-O1). More details in PR24449: https://bugs.llvm.org/show_bug.cgi?id=24449 (this change should be the last step to resolve that bug) Differential Revision: https://reviews.llvm.org/D37451 llvm-svn: 313458	2017-09-16 13:29:12 +00:00
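
Note on 8d0180c955 ([ARM] Use ADDCARRY / SUBCARRY): the carry-flag round trip described in that message can be modeled in a few lines. The following is only a sketch in Python, not LLVM code. The helper names (`addc`, `adde`, `bool_to_carry`, `carry_to_bool`, `addcarry`) are hypothetical and merely mimic the 32-bit arithmetic of the ARMISD::ADDC and ARMISD::ADDE nodes named in the commit message.

```python
MASK32 = 0xFFFFFFFF  # model of a 32-bit register


def addc(a, b):
    """Model of a 32-bit add producing (result, carry_out)."""
    total = (a & MASK32) + (b & MASK32)
    return total & MASK32, total >> 32


def adde(a, b, carry_in):
    """Model of a 32-bit add-with-carry producing (result, carry_out)."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32


def bool_to_carry(r):
    # (_, C) <- ADDC R, -1: adding 0xFFFFFFFF carries out exactly when R != 0.
    _, carry = addc(r, MASK32)
    return carry


def carry_to_bool(carry):
    # (R, _) <- ADDE 0, 0, C: the result register holds the carry as 0 or 1.
    result, _ = adde(0, 0, carry)
    return result


def addcarry(lhs, rhs, carry_in_bool):
    """Hypothetical model of the i32 ADDCARRY lowering sequence."""
    carry = bool_to_carry(carry_in_bool)         # boolean -> carry flag
    result, carry_out = adde(lhs, rhs, carry)    # the actual addition
    return result, carry_to_bool(carry_out)      # carry flag -> boolean


if __name__ == "__main__":
    print(addcarry(0xFFFFFFFF, 0, 1))
    print(addcarry(1, 2, 0))
```

In this model, `bool_to_carry(carry_to_bool(c))` returns `c` for both carry values, which is the redundancy the new combiner in that commit is described as removing.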


21569 Commits