llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	01ae462fef	[X86][SSE] Combine (some) target shuffles with multiple uses As discussed on D41794, we have many cases where we fail to combine shuffles as the input operands have other uses. This patch permits these shuffles to be combined as long as they don't introduce additional variable shuffle masks, which should reduce instruction dependencies and allow the total number of shuffles to still drop without increasing the constant pool. However, this may mean that some memory folds may no longer occur, and on pre-AVX require the occasional extra register move. This also exposes some poor PMULDQ/PMULUDQ codegen which was doing unnecessary upper/lower calculations which will in fact fold to zero/undef - the fix will be added in a followup commit. Differential Revision: https://reviews.llvm.org/D50328 llvm-svn: 339335	2018-08-09 12:30:02 +00:00
Clement Courbet	7db69cc08a	[X86] Fix skylake server scheduling info. Summary: This fixes most of the scheduling info for SKX vector operations. I had to split a lot of the YMM/ZMM classes into separate classes for YMM and ZMM. The before/after llvm-exegesis analysis are in the phabricator diff. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47721 llvm-svn: 334407	2018-06-11 14:37:53 +00:00
Sanjay Patel	59313be8d3	[CodeGen] assume max/default throughput for unspecified instructions This is a fix for the problem arising in D47374 (PR37678): https://bugs.llvm.org/show_bug.cgi?id=37678 We may not have throughput info because it's not specified in the model or it's not available with variant scheduling, so assume that those instructions can execute/complete at max-issue-width. Differential Revision: https://reviews.llvm.org/D47723 llvm-svn: 334055	2018-06-05 23:34:45 +00:00
Simon Pilgrim	1273f4ad93	[X86] Add GPR<->XMM Schedule Tags BtVer2 - fix NumMicroOp and account for the Lat+6cy GPR->XMM and Lat+1cy XMM->GPR delays (see rL332737) The high number of MOVD/MOVQ equivalent instructions meant that there were a number of missed patterns in SNB/Znver1: SNB - add missing GPR<->MMX costs (taken from Agner / Intel AOM) Znver1 - add missing GPR<->XMM MOVQ costs (taken from Agner) llvm-svn: 332745	2018-05-18 17:58:36 +00:00
Simon Pilgrim	007b50fd35	[X86][BtVer2] Improve simulation of (V)PINSR values Include the 6cy delay transferring from the GPR to FPU. llvm-svn: 332737	2018-05-18 17:09:41 +00:00
Simon Pilgrim	3ecb0b80f6	[X86][BtVer2] Partial vector stores (inc MMX) have a 2cy latency llvm-svn: 332722	2018-05-18 14:22:22 +00:00
Simon Pilgrim	be9a206883	[X86] Split WriteCvtF2F into F32->F64 and F64->F32 scheduler classes BtVer2 - Fixes schedules for (V)CVTPS2PD instructions A lot of the Intel models still have too many InstRW overrides for these new classes - this needs cleaning up but I wanted to get the classes in first llvm-svn: 332376	2018-05-15 17:36:49 +00:00
Simon Pilgrim	f3ae50fca2	[X86] Split WriteFRcp/WriteFRsqrt/WriteFSqrt schedule classes WriteFRcp/WriteFRsqrt are split to support scalar, XMM and YMM/ZMM instructions. WriteFSqrt is split into single/double/long-double sizes and scalar, XMM, YMM and ZMM instructions. This removes all InstRW overrides for these instructions. NOTE: There were a couple of typos in the Znver1 model - notably a 1cy throughput for SQRT that is highly unlikely and doesn't tally with Agner. NOTE: I had to add Agner's numbers for several targets for WriteFSqrt80. llvm-svn: 331629	2018-05-07 11:50:44 +00:00
Chandler Carruth	16429acacb	[x86] Revert r330322 (& r330323): Lowering x86 adds/addus/subs/subus intrinsics The LLVM commit introduces a crash in LLVM's instruction selection. I filed http://llvm.org/PR37260 with the test case. llvm-svn: 330997	2018-04-26 21:46:01 +00:00
Alexander Ivchenko	e8fed1546e	Lowering x86 adds/addus/subs/subus intrinsics (llvm part) This is the patch that lowers x86 intrinsics to native IR in order to enable optimizations. The patch also includes folding of previously missing saturation patterns so that IR emits the same machine instructions as the intrinsics. Patch by tkrupa Differential Revision: https://reviews.llvm.org/D44785 llvm-svn: 330322	2018-04-19 12:13:30 +00:00
Craig Topper	e56a2fc5e7	[X86] Add separate scheduling class for PSADBW instruction. llvm-svn: 330204	2018-04-17 19:35:19 +00:00
Simon Pilgrim	8fc2b49620	[X86][Atom] Convert Atom scheduler model to SchedRW (PR32431) Atom is the only x86 target that still uses schedule itineraries, if we can remove this then we can begin the work on removing x86 itineraries. I've also found that it will help with PR36550. I've focussed on matching the existing model as closely as possible (relying on the schedule tests), PR36895 indicated a lot of these were incorrect but we can just as easily fix these after this patch as before. Hopefully we can get llvm-exegesis to help here, There are a few instructions that rely on itinerary scheduling (mainly push/pop/return) of multiple resource stages, but I don't think any of these are show stoppers. There are also a few codegen changes that seem related to the post-ra scheduler acting a little differently, I haven't tracked these down but they don't seem critical. NOTE: I don't have access to any Atom hardware, so this hasn't been tested in the wild. Differential Revision: https://reviews.llvm.org/D45486 llvm-svn: 329837	2018-04-11 18:23:01 +00:00
Simon Pilgrim	86588fc809	[X86][Btver2] Add vector extract costs llvm-svn: 329524	2018-04-08 11:26:26 +00:00
Craig Topper	4cc3827791	[X86] Add MOVZPQILo2PQIrr to the Sandy Bridge scheduler model llvm-svn: 329351	2018-04-05 21:40:32 +00:00
Craig Topper	c6bb36a3d0	[X86] Remove some InstRWs for plain store instructions on Sandy Bridge. We were forcing the latency of these instructions to 5 cycles, but every other scheduler model had them as 1 cycle. I'm sure I didn't get everything, but this gets a big portion. llvm-svn: 329339	2018-04-05 20:04:06 +00:00
Craig Topper	15303dda0d	[X86] Revert r329251-329254 It's failing on the bots and I'm not sure why. This reverts: [X86] Synchronize the SchedRW on some EVEX instructions with their VEX equivalents. [X86] Use WriteFShuffle256 for VEXTRACTF128 to be consistent with VEXTRACTI128 which uses WriteShuffle256. [X86] Remove some InstRWs for plain store instructions on Sandy Bridge. [X86] Auto-generate complete checks. NFC llvm-svn: 329256	2018-04-05 05:19:36 +00:00
Craig Topper	6c4e08c835	[X86] Remove some InstRWs for plain store instructions on Sandy Bridge. We were forcing the latency of these instructions to 5 cycles, but every other scheduler model had them as 1 cycle. I'm sure I didn't get everything, but this gets a big portion. llvm-svn: 329252	2018-04-05 04:42:01 +00:00
Craig Topper	96729cd64b	[X86][Silvermont] Use correct latency and throughput information for divide and square root in the scheduler model. Data taken from Table 16-17 in the Intel Optimization Manual. llvm-svn: 328962	2018-04-02 06:34:16 +00:00
Craig Topper	8104f266a4	[X86] Correct the throughput for divide instructions in Sandy Bridge/Haswell/Broadwell/Skylake scheduler models. Fixes most of PR36898. Still need to fix the 512-bit instructions, but Agner's tables don't have those. llvm-svn: 328960	2018-04-02 05:33:28 +00:00
Simon Pilgrim	a2f26788a3	[X86] Add WriteFMOVMSK/WriteVecMOVMSK/WriteMMXMOVMSK scheduler classes Currently MOVMSK instructions use the WriteVecLogic class, which is a very poor choice given that MOVMSK involves a SSE->GPR transfer. Differential Revision: https://reviews.llvm.org/D44924 llvm-svn: 328664	2018-03-27 20:38:54 +00:00
Simon Pilgrim	86ea53123d	[X86][Btver2] Add CVTSI2SD/CVTSI2SS scheduler costs We still need to account for how Jaguar passes data from GPR -> XMM, which isn't as clean as XMM -> GPR..... llvm-svn: 328551	2018-03-26 17:02:02 +00:00
Simon Pilgrim	8815105cd5	[X86][Btver2] Add CVTSD2SS/CVTSS2SD scheduler costs llvm-svn: 328541	2018-03-26 16:24:13 +00:00
Simon Pilgrim	0b73b29388	[X86][Btver2] Add CVTSD2SI/CVTSS2SI scheduler costs Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write) This also adds missing vcvttss2si tests llvm-svn: 328505	2018-03-26 15:30:47 +00:00
Simon Pilgrim	67df1cf597	[X86][Btver2] Add (V)SQRTPD/(V)SQRTSD costs The xmm sd/pd versions were using the WriteFSQRT default which is modelled on sqrtss/sqrtps llvm-svn: 328497	2018-03-26 14:03:40 +00:00
Craig Topper	6f28d3c954	[X86] Fix the SchedRW for intrinsic register form of SQRT/RCP/RSQRT. llvm-svn: 328474	2018-03-26 05:05:12 +00:00
Craig Topper	cdfcf8ecda	[X86] Merge the SSE and AVX versions of fp divs and sqrts in the SandyBridge/Haswell/Broadwell/Skylake scheduler models. I've used Agner's data as best I could to get the values to converge on. llvm-svn: 328473	2018-03-26 05:05:10 +00:00
Craig Topper	fbf2d850e3	[X86] Add itinerary to intrinsic version of sqrtss, rcpss, and rsqrtss instructions. llvm-svn: 328472	2018-03-26 04:20:36 +00:00
Craig Topper	659f85af14	[X86] Swap the itineraries on the memory and register forms of CVTDQ2PD. They were backwards. llvm-svn: 328469	2018-03-26 02:17:13 +00:00
Simon Pilgrim	91fe24b8cf	[X86][SSE] Ensure we're testing both non-VEX/VEX variants of SSE instructions on AVX targets And ensure we don't use later instruction sets in SSE schedule tests llvm-svn: 328423	2018-03-24 14:51:52 +00:00
Simon Pilgrim	e5c0a041ff	[X86][Btver2] Cleanup MOVMSK instructions to use JFPA function unit Add missing non-VEX and (V)PMOVMSKB instructions to the pattern llvm-svn: 328338	2018-03-23 17:38:59 +00:00
Craig Topper	4787b7f434	[X86] Correct the latencies of SNB integer vector multiplies based on Agner's data. Add missing MMX multiplies. llvm-svn: 328295	2018-03-23 06:41:43 +00:00
Craig Topper	7580a7997d	[X86] Change VPSADBW itinerary to SSE_INTALU_ITINS_P to match the SSE version. llvm-svn: 328293	2018-03-23 06:41:40 +00:00
Craig Topper	7f142b8bf1	[X86] Merge VMOVMSKBrr and MOVMSKBrr in the SNB scheduler model. The VMOVMSKBrr was in a separate InstRW with a lower latency, but I assume they should be the same and the higher latency matches Agner's table so I'm going with that. llvm-svn: 328291	2018-03-23 06:41:38 +00:00
Craig Topper	fae4173b47	[X86] Add VEXTRB/W/D/Q to Zen scheduler model. The SSE versions were present, but not the VEX version. llvm-svn: 328290	2018-03-23 06:41:36 +00:00
Craig Topper	58afb4ea58	[X86][SkylakeClient] Fix a bunch of instructions that were incorrectly assigned Port015 instead of Port01. The VEC ADD and VEC MUL units aren't present on port 5 on SkylakeClient. llvm-svn: 328241	2018-03-22 21:10:07 +00:00
Craig Topper	89dcda3e90	[X86] Remove MMX_MASKMOVQ64 and VMASKMOVDQU from scheduler models. The information was so wildly inaccurate and incomplete it's better to just remove it. MMX_MASKMOVQ64 showed up twice in several scheduler models. In Haswell and Broadwell they were on adjacent lines. On Skylake the copies had different information. MMX_MASKMOVQ and MASKMOVDQU were completely missing. MMX_MASKMOVQ64 was listed on Haswell/Broadwell as 1 cycle on port 1 despite it being a store instruction. Filed PR36780 to track fixing this right. llvm-svn: 327783	2018-03-18 03:24:42 +00:00
Simon Pilgrim	fb7aa57bf1	[X86][SSE] Introduce Float/Vector WriteMove, WriteLoad and WriteStore scheduler classes As discussed on D44428 and PR36726, this patch splits off WriteFMove/WriteVecMove, WriteFLoad/WriteVecLoad and WriteFStore/WriteVecStore scheduler classes to permit vectors to be handled separately from gpr/scalar types. I've minimised the diff here by only moving various basic SSE/AVX vector instructions across - we can fix the rest when called for. This does fix the MOVDQA vs MOVAPS/MOVAPD discrepancies mentioned on D44428. Differential Revision: https://reviews.llvm.org/D44471 llvm-svn: 327630	2018-03-15 14:45:30 +00:00
Puyan Lotfi	43e94b15ea	Followup on Proposal to move MIR physical register namespace to '$' sigil. Discussed here: http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html In preparation for adding support for named vregs we are changing the sigil for physical registers in MIR to '$' from '%'. This will prevent name clashes of named physical register with named vregs. llvm-svn: 323922	2018-01-31 22:04:26 +00:00
Craig Topper	05af43fbad	[X86] Fix some inconsistencies in the itineraries and Sched for (V)PEXTRW/(V)PINSRW The weirdest being that PEXTRWrr was tagged as a memory operation. llvm-svn: 323353	2018-01-24 17:58:57 +00:00
Craig Topper	002657731b	[X86] Move 'Int_' to the end of the name of the VCOMISS/VUCOMISS and instructions to get them picked up by the scheduler model regexs. All other intrinsic instructions put the _Int on the end. This make these instructions consistent and gets the prefix instregexs in the scheduler models to pick them up. llvm-svn: 323261	2018-01-23 21:37:51 +00:00
Simon Pilgrim	a8e6b885bd	[X86][BTVER2] Fix scheduling of VCMPSD/VCMPSS instructions For some reason they don't have a trailing i like the packed equivalents. llvm-svn: 322600	2018-01-16 22:15:41 +00:00
Andrew V. Tischenko	e58c0c96b2	Update BTVER2 sched numbers for some AVX instructions (xmm version). Differential Revision: https://reviews.llvm.org/D40067 llvm-svn: 322485	2018-01-15 14:21:11 +00:00
Craig Topper	162439dcdf	[X86] Pass itins.rr/itins.rm through properly for some instructions. llvm-svn: 321452	2017-12-26 05:43:05 +00:00
Sanjoy Das	1074eb225b	Reapply "[X86] Flag BroadWell scheduler model as complete" This reverts commit r320508, in effect re-applying r320308. Simon has already reverted the parts that caused the crash that motivated the revert in r320492. llvm-svn: 320512	2017-12-12 19:11:31 +00:00
Sanjoy Das	81a4a02cbc	Revert "[X86] Flag BroadWell scheduler model as complete" This reverts commit r320308. r320308 crashes LLC, please see the llvm-commits thread for a reproducer. llvm-svn: 320508	2017-12-12 18:40:58 +00:00
Craig Topper	a0be5a06c1	[X86] Rename some instructions that start with Int_ to have the _Int at the end. This matches AVX512 version and is more consistent overall. And improves our scheduler models. In some cases this adds _Int to instructions that didn't have any Int_ before. It's a side effect of the adjustments made to some of the multiclasses. llvm-svn: 320325	2017-12-10 19:47:56 +00:00
Simon Pilgrim	1f8cfba0bb	[X86] Flag BroadWell scheduler model as complete Locally tag COPY as WriteMove, which has caused some reg-reg + reg-mem instruction tests to reorder. llvm-svn: 320308	2017-12-10 13:49:51 +00:00
Craig Topper	90c9c15936	[X86] Add MOVQI2PQIrm, MOVSDmr, and MOVSDrm to scheduler information The VEX versions were present but not the legacy SSE versions. llvm-svn: 320294	2017-12-10 09:14:44 +00:00
Gadi Haber	2cf601f28f	[X86][Haswell]: Updating the scheduling information for the Haswell subtarget. Updated the scheduling information for the Haswell subtarget with the following changes: Regrouped the instructions after adding appropriate load + store latencies. Added scheduling for missing instructions such as the GATHER instrs. The changes were made after revisiting the latencies impact of all memory uOps. Reviewers: RKSimon, zvi, craig.topper, apilipenko Differential Revision: https://reviews.llvm.org/D40021 Change-Id: Iaf6c1f5169add1552845a8a566af4e5a359217a7 llvm-svn: 320137	2017-12-08 09:48:44 +00:00
Andrew V. Tischenko	44cfc51415	Add proper BTVER2 sched support for MOV instr. Differential Revision: https://reviews.llvm.org/D40345 llvm-svn: 320034	2017-12-07 11:19:49 +00:00


81 Commits