llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	86e3c26924	[X86] Add FP comparison scheduler classes Split VCMP/VMAX/VMIN instructions off to WriteFCmp and VCOMIS instructions off to WriteFCom instead of assuming they match WriteFAdd Differential Revision: https://reviews.llvm.org/D45656 llvm-svn: 330179	2018-04-17 07:22:44 +00:00
Simon Pilgrim	e0c7868ded	Remove comment references to itineraries. NFCI. llvm-svn: 330021	2018-04-13 14:31:57 +00:00
Simon Pilgrim	963bf4de2b	Remove out of date comment. NFCI. llvm-svn: 330019	2018-04-13 14:24:06 +00:00
Simon Pilgrim	6551d405dc	[X86] Remove x86 InstrItinClass entries (PR37093) This removes the last of the x86 schedule itineraries, I'm intending to cleanup the remaining uses of NoItinerary/OpndItins/etc. before resolving PR37093. llvm-svn: 329967	2018-04-12 22:44:47 +00:00
Simon Pilgrim	577ae24feb	[X86] Remove explicit SSE/AVX schedule itineraries from defs (PR37093) llvm-svn: 329940	2018-04-12 19:25:07 +00:00
Simon Pilgrim	35935c0632	[X86] Remove remaining gpr schedule itineraries (PR37093) llvm-svn: 329938	2018-04-12 18:46:15 +00:00
Simon Pilgrim	dec781c141	[X86] Remove gpr shift/extension schedule itineraries (PR37093) llvm-svn: 329933	2018-04-12 18:25:38 +00:00
Simon Pilgrim	8904a86f65	[X86] Remove AES/CLMUL/CRC32/LDDQU/MOVNT/POPCNT/SHA schedule itineraries (PR37093) llvm-svn: 329912	2018-04-12 14:31:42 +00:00
Simon Pilgrim	294556d40e	[X86] Remove remaining system/special schedule itineraries (PR37093) llvm-svn: 329906	2018-04-12 12:43:49 +00:00
Simon Pilgrim	0cd0fbd8c5	[X86] Remove system/control schedule itineraries (PR37093) llvm-svn: 329903	2018-04-12 12:09:24 +00:00
Simon Pilgrim	69e0e8e3d4	[X86] Remove CMOV/SETCC schedule itineraries (PR37093) llvm-svn: 329898	2018-04-12 11:01:40 +00:00
Simon Pilgrim	10e3bdaaa8	[X86] Remove MMX/3DNow schedule itineraries (PR37093) llvm-svn: 329896	2018-04-12 10:49:57 +00:00
Simon Pilgrim	32d368147f	[X86] Remove X87 schedule itineraries (PR37093) First of a number of commits to remove x86 schedule itineraries entirely - approved off-line with @craig.topper llvm-svn: 329893	2018-04-12 10:27:37 +00:00
Simon Pilgrim	89c8a10f7c	[X86] Add variable shuffle schedule classes Split variable index shuffles from immediate index shuffles WriteFVarShuffle - variable 'in-lane' shuffles (VPERMILPS/VPERMIL2PS etc.) WriteVarShuffle - variable 'in-lane' shuffles (PSHUFB/VPPERM etc.) WriteFVarShuffle256 - variable 'cross-lane' shuffles (VPERMPS etc.) WriteVarShuffle256 - variable 'cross-lane' shuffles (VPERMD etc.) Differential Revision: https://reviews.llvm.org/D45404 llvm-svn: 329806	2018-04-11 13:49:19 +00:00
Craig Topper	b7baa358f6	[X86] Add SchedWrites for CMOV and SETCC. Use them to remove InstRWs. Summary: Cmov and setcc previously used WriteALU, but on Intel processors at least they are more restricted than basic ALU ops. This patch adds new SchedWrites for them and removes the InstRWs. I had to leave some InstRWs for CMOVA/CMOVBE and SETA/SETBE because those have an extra uop relative to the other condition codes on Intel CPUs. The test changes are due to fixing a missing ZnAGU dependency on the memory form of setcc. Reviewers: RKSimon, andreadb, GGanesh Reviewed By: RKSimon Subscribers: GGanesh, llvm-commits Differential Revision: https://reviews.llvm.org/D45380 llvm-svn: 329539	2018-04-08 17:53:18 +00:00
Craig Topper	f0d042619b	[X86] Attempt to model basic arithmetic instructions in the Haswell/Broadwell/Skylake scheduler models without InstRWs Summary: This patch removes InstRW overrides for basic arithmetic/logic instructions. To do this I've added the store address port to RMW. And used a WriteSequence to make the latency additive. It does not cover ADC/SBB because they have different latency. Apparently we were inconsistent about whether the store has latency or not thus the test changes. I've also left out Sandy Bridge because the load latency there is currently 4 cycles and should be 5. Reviewers: RKSimon, andreadb Reviewed By: andreadb Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45351 llvm-svn: 329416	2018-04-06 16:16:48 +00:00
Craig Topper	22d25a08ae	[X86] Merge itineraries for CLC, CMC, and STC. These are very simple flag setting instructions that appear to only be a single uop. They're unlikely to need this separation. llvm-svn: 329414	2018-04-06 16:16:43 +00:00
Craig Topper	13a0f83a05	[X86] Add SchedRW for PMULLD Summary: It seems many CPUs don't implement this instruction as well as the other vector multiplies. Often using a multi uop flow. Silvermont in particular has a 7 uop flow with 11 cycle throughput. Sandy Bridge implements it as a single uop with 5 cycle latency and 1 cycle throughput. But Haswell and later use 2 uops with 10 cycle latency and 2 cycle throughput. This patch adds a new X86SchedWritePair we can use to tag this instruction separately. I've provided correct information for Silvermont, Btver2, and Sandy Bridge. I've removed the InstRWs for SandyBridge. I've left Haswell/Broadwell/Skylake InstRWs in place because I wasn't sure how to account for the different load latency between 128 and 256 bits. I also left Znver1 InstRWs in place because the existing values don't match Agner's spreadsheet. I also left a FIXME in the SandyBridge model because it being used for the "generic" model is too optimistic for the 256/512-bit versions since those are multiple uops on all known CPUs. Reviewers: RKSimon, GGanesh, courbet Reviewed By: RKSimon Subscribers: gchatelet, gbedwell, andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D44972 llvm-svn: 328914	2018-03-31 04:54:32 +00:00
Craig Topper	89310f56c8	[X86] Correct the placement of ReadAfterLd in BEXTR and BZHI. Add dedicated SchedRW for BEXTR/BZHI. These instructions have the memory operand before the register operand. So we need to put ReadDefault for all the load ops first. Then the ReadAfterLd Differential Revision: https://reviews.llvm.org/D44838 llvm-svn: 328823	2018-03-29 20:41:39 +00:00
Simon Pilgrim	a2f26788a3	[X86] Add WriteFMOVMSK/WriteVecMOVMSK/WriteMMXMOVMSK scheduler classes Currently MOVMSK instructions use the WriteVecLogic class, which is a very poor choice given that MOVMSK involves a SSE->GPR transfer. Differential Revision: https://reviews.llvm.org/D44924 llvm-svn: 328664	2018-03-27 20:38:54 +00:00
Simon Pilgrim	28e7bcbba6	[X86] Add WriteCRC32 scheduler class Currently CRC32 instructions use the WriteFAdd class, this patch splits them off into their own, at the moment it is still mostly just a duplicate of WriteFAdd but it can now be tweaked on a target by target basis. Differential Revision: https://reviews.llvm.org/D44647 llvm-svn: 328582	2018-03-26 21:06:14 +00:00
Simon Pilgrim	f33d905293	[X86] Add WriteBitScan/WriteLZCNT/WriteTZCNT/WritePOPCNT scheduler classes (PR36881) Give the bit count instructions their own scheduler classes instead of forcing them into existing classes. These were mostly overridden anyway, but I had to add in costs from Agner for silvermont and znver1 and the Fam16h SoG for btver2 (Jaguar). Differential Revision: https://reviews.llvm.org/D44879 llvm-svn: 328566	2018-03-26 18:19:28 +00:00
Craig Topper	5ccd87233f	[X86] Make the multiply and divide itineraries more consistent. Sometimes we used the same itinerary for MEM and REG forms, but that seems inconsistent with our usual usage. We also used the MUL8 itinerary for MULX32/64 which was also weird. The test changes are because we were using IIC_IMUL32_RR and IIC_IMUL64_RR instead of IIC_IMUL32_REG/IIC_IMUL64_REG for the 32 and 64 bit multiplies that produce double width result. llvm-svn: 327866	2018-03-19 16:38:33 +00:00
Craig Topper	e9c99d32b3	[X86] Remove two unused InstrItinClass llvm-svn: 327819	2018-03-19 02:07:32 +00:00
Simon Pilgrim	fb7aa57bf1	[X86][SSE] Introduce Float/Vector WriteMove, WriteLoad and WriteStore scheduler classes As discussed on D44428 and PR36726, this patch splits off WriteFMove/WriteVecMove, WriteFLoad/WriteVecLoad and WriteFStore/WriteVecStore scheduler classes to permit vectors to be handled separately from gpr/scalar types. I've minimised the diff here by only moving various basic SSE/AVX vector instructions across - we can fix the rest when called for. This does fix the MOVDQA vs MOVAPS/MOVAPD discrepancies mentioned on D44428. Differential Revision: https://reviews.llvm.org/D44471 llvm-svn: 327630	2018-03-15 14:45:30 +00:00
Simon Pilgrim	f00ea1b4cd	[X86] Add RDMSR/WRMSR, RDPMC + RDTSC/RDTSCP schedule tests Add missing RDTSCP itinerary llvm-svn: 320581	2017-12-13 14:22:04 +00:00
Craig Topper	57c2815cbe	[X86] Adjust tablegen includes so we can use Instructions in scheduler models instead of just instregexs. This separates the CPU specific scheduler model includes to occur after the instructions. Moves the instruction includes between the basic scheduler information and the CPU specific scheduler models. llvm-svn: 320313	2017-12-10 17:42:36 +00:00
Simon Pilgrim	91c159d841	[X86][AVX] Tag VZEROALL/VZEROUPPER instructions scheduler classes llvm-svn: 320302	2017-12-10 12:26:35 +00:00
Simon Pilgrim	7e636cc419	[X86] Tag FS/GS BASE R/W instruction scheduler classes llvm-svn: 320264	2017-12-09 20:42:27 +00:00
Simon Pilgrim	42fcda9a6c	[X86][MPX] Tag MPX instructions scheduler classes Currently tagged these as system instructions, once we have uses for them (ASAN?) and they are faster we will need to improve on this. llvm-svn: 320173	2017-12-08 19:03:42 +00:00
Simon Pilgrim	1ddcae665e	[X86] Tag PKU/INVPCID/RDPID/SMAP/SMX/PTWRITE system instructions scheduler classes llvm-svn: 320158	2017-12-08 15:48:37 +00:00
Simon Pilgrim	a13271bcba	[X86][VMX] Tag VMX instructions scheduler classes Tagged all as system instructions llvm-svn: 320053	2017-12-07 15:57:32 +00:00
Simon Pilgrim	f1d599adb2	[X86] Tag LZCNT/TZCNT instructions scheduler classes Tagged as IMUL instructions for a reasonable approximation (ALU tends to be a lot faster) - POPCNT is currently tagged as FAdd which I think should be replaced with IMUL as well llvm-svn: 320051	2017-12-07 15:24:14 +00:00
Simon Pilgrim	6b7cd86ca7	[X86][SVM] Tag SVM instructions scheduler classes Tagged all as system instructions llvm-svn: 320047	2017-12-07 14:35:17 +00:00
Simon Pilgrim	60411d9a8c	[X86] Tag RDRAND/RDSEED instruction scheduler classes llvm-svn: 320045	2017-12-07 14:18:48 +00:00
Simon Pilgrim	65f805fe30	[X86][X87] Tag FCMOV instruction scheduler classes llvm-svn: 319804	2017-12-05 18:01:26 +00:00
Simon Pilgrim	bc8d0223fb	[X86][SSE] Remove unused IIC_SSE_CVT_PI2PS_RR/IIC_SSE_CVT_PI2PS_RM itineraries llvm-svn: 319634	2017-12-03 20:57:04 +00:00
Simon Pilgrim	0747a7e8c3	[X86][X87] Tag FABS/FCHS/FSQRT/FSIN/FCOS x87 instruction scheduler classes Atom's FABS/FCHS/FSQRT latencies taken from Agner. Note: I just added FSIN and FCOS to the existing IIC_FSINCOS itinerary, which is actually a more costly instruction. llvm-svn: 319175	2017-11-28 15:03:42 +00:00
Simon Pilgrim	fe6e92d517	[X86][3DNow] Add 3DNow! instruction itinerary and scheduling classes llvm-svn: 319005	2017-11-26 20:50:29 +00:00
Simon Pilgrim	f545bb6cae	[X86][MMX] Add IIC_MMX_MOVMSK instruction itinerary class llvm-svn: 318999	2017-11-26 17:56:07 +00:00
Gadi Haber	323f2e1715	[X86][Broadwell] Added the instruction scheduling information for the Broadwell CPU. Adding the scheduling information for the Broadwell (BDW) CPU target. This patch adds the instruction scheduling information for the Broadwell (BDW) architecture target by adding the file X86SchedBroadwell.td located under the X86 Target. We used the scheduling information retrieved from the Broadwell architects in order to create the file. The scheduling information includes latency, number of micro-Ops and used ports by each BDW instruction. The patch continues the scheduling replacement and insertion effort started with the SandyBridge (SNB) target in r310792, the Haswell (HSW) target in r311879, the SkylakeClient (SKL) target in rL313613 + rL315978 and the SkylakeServer (SKX) in rL315175. Performance fluctuations may be expected due to code alignment effects. Reviewers: zvi, RKSimon, craig.topper Differential Revision: https://reviews.llvm.org/D39054 Change-Id: If6f799e5ff60e1091c8d43b05ea78c53581bae01 llvm-svn: 316492	2017-10-24 20:19:47 +00:00
Gadi Haber	684944b822	[X86][SKX] Adding the scheduling information for the SKX target. Adding the scheduling information for the SkylakeServer (SKX) target. This patch adds the instruction scheduling information for the SkylakeServer (SKX) architecture target by adding the file X86SchedSkylakeServer.td located under the X86 Target. We used the scheduling information retrieved from the Skylake architects in order to create the file. The scheduling information includes latency, number of micro-Ops and used ports by each SKL instruction. The patch continues the scheduling replacement and insertion effort started with the SNB target in r310792, the HSW target in r311879 and the SkylakeClient (SKL) target in rL313613. Please expect some performance fluctuations due to code alignment effects. Reviewers: zvi, RKSimon, craig.topper, chandlerc, aymanmu Differential Revision: https://reviews.llvm.org/D38443 Change-Id: I5c228fcc09e9e5a99b6116e62b356c4f9b971185 llvm-svn: 315175	2017-10-08 12:52:54 +00:00
Gadi Haber	6f8fbf4b86	[X86][Skylake] Adding the scheduling information for the SkylakeClient target This patch adds the instruction scheduling information for the SkylakeClient (SKL) architecture target by adding the file X86SchedSkylakeClient.td located under the X86 Target. We used the scheduling information retrieved from the Skylake architects in order to create the file. The scheduling information includes latency, number of micro-Ops and used ports by each SKL instruction. The patch continues the scheduling replacement and insertion effort started with the SNB target in r307529 and r310792 and for HSW in r311879. Please expect some performance fluctuations due to code alignment effects. Reviewers: craig.topper, zvi, chandlerc, igorb, aymanmus, RKSimon, delena Differential Revision: https://reviews.llvm.org/D37294 llvm-svn: 313613	2017-09-19 06:19:27 +00:00
Simon Pilgrim	3f24ff6130	[X86][SSE] Added missing PACKSS/PACKUS intrinsic schedules Improves atom scheduler test coverage (to make it easier to upgrade them for PR32431). Checked on Agner that these actually match the UNPACK schedules, but better to include a separate class llvm-svn: 309701	2017-08-01 16:47:48 +00:00
Craig Topper	106b5b6856	AMD znver1 Initial Scheduler model Summary: This patch adds the following 1. Adds a skeleton scheduler model for AMD Znver1. 2. Introduces the znver1 execution units and pipes. 3. Caters the instructions based on the generic scheduler classes. 4. Further additions to the scheduler model with instruction itineraries will be carried out incrementally based on a. Instructions types b. Registers used 5. Since itineraries are not added based on instructions, throughput information are bound to change when incremental changes are added. 6. Scheduler testcases are modified accordingly to suit the new model. Patch by Ganesh Gopalasubramanian. With minor formatting tweaks from me. Reviewers: craig.topper, RKSimon Subscribers: javed.absar, shivaram, ddibyend, vprasad Differential Revision: https://reviews.llvm.org/D35293 llvm-svn: 308411	2017-07-19 02:45:14 +00:00
Andrew V. Tischenko	8cb1d0931f	Add scheduler classes to integer/float horizontal operations. This patch will close PR32801. Differential Revision: https://reviews.llvm.org/D33203 llvm-svn: 304986	2017-06-08 16:44:13 +00:00
Simon Pilgrim	99b925bdf3	[X86][LWP] Add llvm support for LWP instructions (reapplied). This patch adds support for the LightWeight Profiling (LWP) instructions which are available on all AMD Bulldozer class CPUs (bdver1 to bdver4). Reapplied - this time without changing line endings of existing files. Differential Revision: https://reviews.llvm.org/D32769 llvm-svn: 302041	2017-05-03 15:51:39 +00:00
Simon Pilgrim	a271c54324	Revert rL302028 due to accidental line ending changes. llvm-svn: 302038	2017-05-03 15:42:29 +00:00
Simon Pilgrim	b2e0464fde	[X86][LWP] Add llvm support for LWP instructions. This patch adds support for the LightWeight Profiling (LWP) instructions which are available on all AMD Bulldozer class CPUs (bdver1 to bdver4). Differential Revision: https://reviews.llvm.org/D32769 llvm-svn: 302028	2017-05-03 15:18:34 +00:00
Craig Topper	50f3d1452c	[X86] Clzero intrinsic and its addition under znver1 This patch does the following. 1. Adds an Intrinsic int_x86_clzero which works with __builtin_ia32_clzero 2. Identifies clzero feature using cpuid info. (Function:8000_0008, Checks if EBX[0]=1) 3. Adds the clzero feature under znver1 architecture. 4. The custom inserter is added in Lowering. 5. A testcase is added to check the intrinsic. 6. The clzero instruction is added to assembler test. Patch by Ganesh Gopalasubramanian with a couple formatting tweaks, a disassembler test, and using update_llc_test.py from me. Differential revision: https://reviews.llvm.org/D29385 llvm-svn: 294558	2017-02-09 04:27:34 +00:00

85 Commits