llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	28e7bcbba6	[X86] Add WriteCRC32 scheduler class Currently CRC32 instructions use the WriteFAdd class, this patch splits them off into their own, at the moment it is still mostly just a duplicate of WriteFAdd but it can now be tweaked on a target by target basis. Differential Revision: https://reviews.llvm.org/D44647 llvm-svn: 328582	2018-03-26 21:06:14 +00:00
Simon Pilgrim	f33d905293	[X86] Add WriteBitScan/WriteLZCNT/WriteTZCNT/WritePOPCNT scheduler classes (PR36881) Give the bit count instructions their own scheduler classes instead of forcing them into existing classes. These were mostly overridden anyway, but I had to add in costs from Agner for silvermont and znver1 and the Fam16h SoG for btver2 (Jaguar). Differential Revision: https://reviews.llvm.org/D44879 llvm-svn: 328566	2018-03-26 18:19:28 +00:00
Craig Topper	cdfcf8ecda	[X86] Merge the SSE and AVX versions of fp divs and sqrts in the SandyBridge/Haswell/Broadwell/Skylake scheduler models. I've used Agner's data as best I could to get the values to converge on. llvm-svn: 328473	2018-03-26 05:05:10 +00:00
Simon Pilgrim	68a8fbc102	[X86] Use WriteResPair for WriteIDiv to cleanup sched defs. NFCI. llvm-svn: 328460	2018-03-25 20:16:53 +00:00
Simon Pilgrim	e3547af7be	[X86] Add the ability to override memory folding latency to schedules and add 1uop for memory folds for Intel models The Intel models need an extra 1uop for memory folded instructions, plus a lot of instructions take a non-default memory latency which should allow us to use the multiclass a lot more to tidy things up. Differential Revision: https://reviews.llvm.org/D44840 llvm-svn: 328446	2018-03-25 10:21:19 +00:00
Simon Pilgrim	2b5967f510	[X86][Haswell] Merge xmm/ymm instructions instregex entries to reduce regex matches to reduce compile time llvm-svn: 328432	2018-03-24 18:36:01 +00:00
Craig Topper	40d3b32e12	[X86] Rename VROUNDYPS* and VROUNDYPD* instructions to VROUNDPSY* and VROUNDPDY*. Fix itinerary mistake on all memory forms of VROUNDPD This makes the Y position consistent with other instructions. This should have been NFC, but while refactoring the multiclass I noticed that VROUNDPD memory forms were using the register itinerary. llvm-svn: 328254	2018-03-22 21:55:20 +00:00
Craig Topper	4a3be6e578	[X86] Correct the scheduling data for some of the 32 and 64 bit multiplies to as best as I understand how they are implemented. llvm-svn: 328231	2018-03-22 19:22:51 +00:00
Simon Pilgrim	53b2c3329a	[X86][SSE42] Use the default PCMPEST/PCMPIST scheduler classes directly. NFCI. Models were completely overriding all SSE42 strins instructions when the default classes could be used for exactly the same coverage. llvm-svn: 328203	2018-03-22 14:56:18 +00:00
Simon Pilgrim	3b2ff1faa9	[X86][CLMUL] Use the default CLMUL scheduler classes directly. NFCI. Models were completely overriding all CLMUL instructions when the WriteCLMUL default classes could be used for exactly the same coverage. llvm-svn: 328194	2018-03-22 13:37:30 +00:00
Simon Pilgrim	7684e055b3	[X86] Use the default AES scheduler classes directly. NFCI. Models were completely overriding all AES instructions when the WriteAES default classes could be used for exactly the same coverage. Removes 6 unnecessary scheduler classes from every model. Note: Still looking for a way for tblgen to warn when this is happening - often the override is more complete than the default. llvm-svn: 328192	2018-03-22 13:18:08 +00:00
Craig Topper	df7855fc8d	[X86] Remove unused SchedWriteRes classes. NFC llvm-svn: 328181	2018-03-22 04:52:08 +00:00
Simon Pilgrim	ec2f878779	[X86][Haswell] Merge multiple InstrRW entries that map to the same SchedWriteRes group (NFCI) (PR35955) I've also merged some VEX/non-VEX instregex strings with a (V?) prefix or (Y?) ymm variant - there are still a lot more of these to do. llvm-svn: 328111	2018-03-21 16:19:03 +00:00
Simon Pilgrim	62690e9d0e	[X86][Haswell][Znver1] Fix typo in fldl instregexs Missing comma was casing 2 instregex entries to be concatenated together by mistake. Found while investigating PR35548 llvm-svn: 327992	2018-03-20 15:44:47 +00:00
Craig Topper	3e9462607e	[X86] Add TEST16mi/TEST32mi/TEST64mi32 to the Sandybridge/Haswell/Broadwell/Skylake scheduler models. Move it from a load+store group on SNB to a load only group, the same group as CMP. llvm-svn: 327944	2018-03-20 03:02:03 +00:00
Craig Topper	b4c7873f8c	[X86] Add JCXZ/JECXZ to Sandybridge/Haswell/Broadwell/Skylake scheduler models. JRCXZ was already present, but not the others. We never codegen this instruction so this doesn't affect much just trying to get them all into a single generated scheduler class in the output. llvm-svn: 327881	2018-03-19 19:00:32 +00:00
Craig Topper	836cfb3a4c	[X86] Add the rest of the TEST with immediate instructions to the scheduler models to match their 8-bit counterpart. llvm-svn: 327874	2018-03-19 17:58:41 +00:00
Craig Topper	645e531a69	[X86] Add MOV16ri/MOV32ri/MOV64ri* to scheduler models to match MOV8ri. Correct SchedRW and itinerary for MOV32ri64. llvm-svn: 327872	2018-03-19 17:46:59 +00:00
Simon Pilgrim	30c38c3849	[X86] Generalize schedule classes to support multiple stages Currently the WriteResPair style multi-classes take a single pipeline stage and latency, this patch generalizes this to make it easier to create complex schedules with ResourceCycles and NumMicroOps be overriden from their defaults. This has already been done for the Jaguar scheduler to remove a number of custom schedule classes and adding it to the other x86 targets will make it much tidier as we add additional classes in the future to try and replace so many custom cases. I've converted some instructions but a lot of the models need a bit of cleanup after the patch has been committed - memory latencies not being consistent, the class not actually being used when we could remove some/all customs, etc. I'd prefer to keep this as NFC as possible so later patches can be smaller and target specific. Differential Revision: https://reviews.llvm.org/D44612 llvm-svn: 327855	2018-03-19 14:46:07 +00:00
Craig Topper	d10ceffa5f	[X86] Add ADD16i16/ADD32i32/ADD64i32 and similar to the scheduler models to match ADD8i8. Also move ADC8i8 and SBB8i8 in the Sandy Bridge model to the same class as ADC8ri and SBB8ri. That seems more accurate since its the 8i8 is just the register forced to AL instead of coming from modrm. llvm-svn: 327820	2018-03-19 04:21:40 +00:00
Craig Topper	9b60dcb29b	[X86] Merge 32 and 64-bit RORX/SHLX/SARX/SHRX into single regular expressions in scheduler models. llvm-svn: 327816	2018-03-19 00:56:11 +00:00
Craig Topper	13a1650d8a	[X86] Merge 8-bit instructions into instregex with 16/32/64 instructions in the scheduler models as much as possible. NFCI This reduces the total number of generated scheduler classes from 5404 to 5316. llvm-svn: 327815	2018-03-19 00:56:09 +00:00
Craig Topper	2d451e73f9	[X86] Fix a bunch of overlapping regular expressions in the scheduler models. llvm-svn: 327787	2018-03-18 08:38:06 +00:00
Craig Topper	89dcda3e90	[X86] Remove MMX_MASKMOVQ64 and VMASKMOVDQU from scheduler models. The information was so wildly inaccurate and incomplete its better to just remove it. MMX_MASKMOVQ64 showed up twice in several scheduler models. In Haswell and Broadwell they were on adjacent lines. On Skylake the copies had different information. MMX_MASKMOVQ and MASKMOVDQU were completely missing. MMX_MASKMOVQ64 was listed on Haswell/Broadwell as 1 cycle on port 1 despite it being a store instruction. Filed PR36780 to track fixing this right. llvm-svn: 327783	2018-03-18 03:24:42 +00:00
Simon Pilgrim	fb7aa57bf1	[X86][SSE] Introduce Float/Vector WriteMove, WriteLoad and Writetore scheduler classes As discussed on D44428 and PR36726, this patch splits off WriteFMove/WriteVecMove, WriteFLoad/WriteVecLoad and WriteFStore/WriteVecStore scheduler classes to permit vectors to be handled separately from gpr/scalar types. I've minimised the diff here by only moving various basic SSE/AVX vector instructions across - we can fix the rest when called for. This does fix the MOVDQA vs MOVAPS/MOVAPD discrepancies mentioned on D44428. Differential Revision: https://reviews.llvm.org/D44471 llvm-svn: 327630	2018-03-15 14:45:30 +00:00
Clement Courbet	327fac4d75	[X86] Add IMUL scheduling info on sandybridge, fix it on >=haswell. Summary: Only IMUL16rri uses an extra P0156. IMUL32* and IMUL16rr only use P1. This was computed using https://github.com/google/EXEgesis/blob/master/exegesis/tools/compute_itineraries.cc This can easily be validated by running perf on the following code: ``` int main(int argc, char**argv) { int a = argc; int b = argc; int c = argc; int d = argc; for (int i = 0; i < LOOP_ITERATIONS; ++i) { asm volatile( R"( .rept 10000 imull $0x2, %%edx, %%eax imull $0x2, %%ecx, %%ebx imull $0x2, %%eax, %%edx imull $0x2, %%ebx, %%ecx .endr )" : "+a"(a), "+b"(b), "+c"(c), "+d"(d) : :); } return a+b+c+d; } ``` -> test.cc perf stat -x, -e cycles --pfm-events=uops_executed_port:port_0:u,uops_executed_port:port_1:u,uops_executed_port:port_2:u,uops_executed_port:port_3:u,uops_executed_port:port_4:u,uops_executed_port:port_5:u,uops_executed_port:port_6:u,uops_executed_port:port_7:u test Reviewers: craig.topper, RKSimon, gadi.haber Subscribers: llvm-commits, gchatelet, chandlerc Differential Revision: https://reviews.llvm.org/D43460 llvm-svn: 326877	2018-03-07 08:14:02 +00:00
Craig Topper	b369cdbaad	[X86] Expand IMUL/MUL instregexs in Intel scheduler models. Add load latency to some of them in SkylakeClient model. The regular expressions and the imul names caused some instructions to be matched by multiple regexs creating unpredictable results. This changes them all to use explicit instrs instead. While doing this I also found that some instructions in Skylake were missing load latency so I fixed that too. llvm-svn: 323406	2018-01-25 06:57:42 +00:00
Craig Topper	066e73762d	[X86] Name the MMX phaddd instruction with 3 Ds instead of just 2. NFC llvm-svn: 323403	2018-01-25 04:45:32 +00:00
Craig Topper	dbddac0915	[X86] Remove 64/128/256 from MMX/SSE/AVX instruction names for overall consistency. NFC MMX instrutions all start with MMX_ so the 64 isn't needed for disambigutation. SSE/AVX1 instructions are assumed 128-bit so we don't need to say 128. AVX2 instructions should use a Y to indicate 256-bits. llvm-svn: 323402	2018-01-25 04:45:30 +00:00
Craig Topper	81c87092d1	[X86] Remove unnecessary '_alt' and '_Int' from scheduler model regular expressions. These were treated as optional suffixes, but the regular expressions are already prefix matches so this is unnecessary. It breaks the binary search optimization in tablegen due to the top level question mark. llvm-svn: 323401	2018-01-25 04:45:28 +00:00
Craig Topper	b85b484fee	[X86] Adjust names of PINSRW/PEXTRW intructions between MMX/SSE/AVX/AVX512 for consistency and to maybe enable more regular expression compaction in the scheduler models. NFCI llvm-svn: 323352	2018-01-24 17:58:51 +00:00
Craig Topper	23cc866c97	[X86] Remove '(_REV)?' from a bunch of scheduler regular expressions. NFC The regexs are treated as a prefix match already so the checking for optional text at the end provides no value. Instead it prevents the binary search optimization in tablegen from kicking in due to the top level question mark. llvm-svn: 323351	2018-01-24 17:58:42 +00:00
Craig Topper	f4cd9083ac	[X86] Make better use of instregex for cmovcc/setcc/jcc instructions in the Intel scheduler models. Combine all the separate condition codes into a singular expression when possible. llvm-svn: 322924	2018-01-19 05:47:32 +00:00
Craig Topper	de1d28e053	[X86] Remove duplicate lines from scheduler models. NFC llvm-svn: 322615	2018-01-17 03:50:21 +00:00
Craig Topper	a42a2ba221	[X86] Combine some more scheduler model entries using regular expressions. We had a lot of separate 32 and 64 instructions that had the same scheduling data. This merges them into the same regular expression. This is pretty consistent with a lot of other instructions. llvm-svn: 320924	2017-12-16 18:35:31 +00:00
Craig Topper	17a311831c	[X86] Use instrs instead of instregex for gather/scatter instructions in the scheduler models. Combine into single InstrRW entries. The reduces the number of scheduler groups in subtarget info. llvm-svn: 320923	2017-12-16 18:35:29 +00:00
Craig Topper	f82867c95a	Recommit r320461 "[X86] Use regular expressions more aggressively to reduce the number of scheduler entries needed for FMA3 instructions." I've hopefully sidestepped the MSVC issue that caused it to be reverted. We no longer include the Sched enum from X86GenInstrInfo.inc on the X86 target. So hopefully MSVC's preprocessor will skip over it and nothing will notice the 11000 character enum name. Original commit message: When the scheduler tables are generated by tablegen, the instructions are divided up into groups based on their default scheduling information and how they are referenced by groups for each processor. For any set of instructions that are matched by a specific InstRW line, that group of instructions is guaranteed to not be in a group with any other instructions. So in general, the more InstRW class definitions are created, the more groups we end up with in the generated files. Particularly if a lot of the InstRW lines only match to single instructions, which is true of a large number of the Intel scheduler models. This change alone reduces the number of instructions groups from ~6000 to ~5500. And there's lots more we could do. llvm-svn: 320655	2017-12-13 23:11:30 +00:00
Simon Pilgrim	0f8a5a41cf	Revert r320461 - causing ICE in windows buildss [X86] Use regular expressions more aggressively to reduce the number of scheduler entries needed for FMA3 instructions. When the scheduler tables are generated by tablegen, the instructions are divided up into groups based on their default scheduling information and how they are referenced by groups for each processor. For any set of instructions that are matched by a specific InstRW line, that group of instructions is guaranteed to not be in a group with any other instructions. So in general, the more InstRW class definitions are created, the more groups we end up with in the generated files. Particularly if a lot of the InstRW lines only match to single instructions, which is true of a large number of the Intel scheduler models. This change alone reduces the number of instructions groups from ~6000 to ~5500. And there's lots more we could do. llvm-svn: 320470	2017-12-12 11:34:25 +00:00
Craig Topper	c8e64ab539	[X86] Use regular expressions more aggressively to reduce the number of scheduler entries needed for FMA3 instructions. When the scheduler tables are generated by tablegen, the instructions are divided up into groups based on their default scheduling information and how they are referenced by groups for each processor. For any set of instructions that are matched by a specific InstRW line, that group of instructions is guaranteed to not be in a group with any other instructions. So in general, the more InstRW class definitions are created, the more groups we end up with in the generated files. Particularly if a lot of the InstRW lines only match to single instructions, which is true of a large number of the Intel scheduler models. This change alone reduces the number of instructions groups from ~6000 to ~5500. And there's lots more we could do. llvm-svn: 320461	2017-12-12 08:17:04 +00:00
Craig Topper	a0be5a06c1	[X86] Rename some instructions that start with Int_ to have the _Int at the end. This matches AVX512 version and is more consistent overall. And improves our scheduler models. In some cases this adds _Int to instructions that didn't have any Int_ before. It's a side effect of the adjustments made to some of the multiclasses. llvm-svn: 320325	2017-12-10 19:47:56 +00:00
Craig Topper	90c9c15936	[X86] Add MOVQI2PQIrm, MOVSDmr, and MOVSDrm to scheduler information The VEX versions were present but not the legacy SSE versions. llvm-svn: 320294	2017-12-10 09:14:44 +00:00
Craig Topper	28e55386ac	[X86] Add LEA64_32r to scheduler models for Sandybridge,Haswell,Broadwell,Skylake llvm-svn: 320293	2017-12-10 09:14:42 +00:00
Craig Topper	8ade4640f3	[X86] Add IN16/OUT16 to scheduling information for Haswell,Broadwell,Skylake Sandy Bridge is also missing it, but it has other issues. See PR35590. llvm-svn: 320292	2017-12-10 09:14:41 +00:00
Craig Topper	1a88c50fd7	[X86] Fix scheduler models to support ADD32ri in addition to ADD32ri8. Similar for all sizes of AND/OR/XOR/SUB/ADC/SBB/CMP. llvm-svn: 320291	2017-12-10 09:14:39 +00:00
Craig Topper	6c65910160	[X86] Add CMPSDrr/rm to the scheduler models. Somehow CMPSSrr/rm was there and the VEX version was there, but this was consistently missing. llvm-svn: 320289	2017-12-10 09:14:37 +00:00
Craig Topper	391c6f9507	[X86] Fix bad regular expressions in the scheduler models. Question marks should be outside of multicharacter parenthesized expressions If the question mark is inside the parentheses it only applies to the single character proceeding it. I had to make a few additional cleanups to fix some duplicate warnings that were exposed by fixing this. llvm-svn: 320279	2017-12-10 01:24:08 +00:00
Craig Topper	5ffe80103e	[X86] Add the commutable floating point min/max pseudo instructions to sandybridge,haswell,broadwell,skylakeclient scheduler models. llvm-svn: 320277	2017-12-10 01:24:05 +00:00
Gadi Haber	2cf601f28f	[X86][Haswell]: Updating the scheduling information for the Haswell subtarget. Updated the scheduling information for the Haswell subtarget with the following changes: Regrouped the instructions after adding appropriate load + store latencies. Added scheduling for missing instructions such as the GATHER instrs. The changes were made after revisiting the latencies impact of all memory uOps. Reviewers: RKSimon, zvi, craig.topper, apilipenko Differential Revision: https://reviews.llvm.org/D40021 Change-Id: Iaf6c1f5169add1552845a8a566af4e5a359217a7 llvm-svn: 320137	2017-12-08 09:48:44 +00:00
Simon Pilgrim	97160be53d	[X86][FMA] Tag all FMA/FMA4 instructions with WriteFMA schedule class As mentioned on PR17367, many instructions are missing scheduling tags preventing us from setting 'CompleteModel = 1' for better instruction analysis. This patch deals with FMA/FMA4 which is one of the bigger offenders (along with AVX512 in general). Annoyingly all scheduler models need to define WriteFMA (now that its actually used), even for older targets without FMA/FMA4 support, but that is an existing problem shared by other schedule classes. Differential Revision: https://reviews.llvm.org/D40351 llvm-svn: 319016	2017-11-27 10:41:32 +00:00
Craig Topper	c20b46da2f	[X86] Change register&memory TEST instructions from MRMSrcMem to MRMDstMem Summary: Intel documentation shows the memory operand as the first operand. But we currently treat it as the second operand. Conceptually the order doesn't matter since it doesn't write memory. We have aliases to parse with the operands in either order and the isel matching is commutable. For the register&register form order does matter for the assembly parser. PR22995 was previously filed and fixed by changing the register&register form from MRMSrcReg to MRMDestReg to match gas. Ideally the memory form should match by using MRMDestMem. I believe this supercedes D38025 which was trying to switch the register&register form back to pre-PR22995. Reviewers: aymanmus, RKSimon, zvi Reviewed By: aymanmus Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38120 llvm-svn: 314639	2017-10-01 23:53:53 +00:00

1 2

91 Commits