Commit Graph

123 Commits

Author SHA1 Message Date
Simon Pilgrim f09fc3bc12 [X86] Move ReadAfterLd functionality into X86FoldableSchedWrite (PR36957)
Currently we hardcode instructions with ReadAfterLd if the register operands don't need to be available until the folded load has completed. This doesn't take into account the different load latencies of different memory operands (PR36957).

This patch adds a ReadAfterFold def into X86FoldableSchedWrite to replace ReadAfterLd, allowing us to specify the load latency at a scheduler class level.

I've added ReadAfterVec*Ld classes that match the XMM/Scl, XMM and YMM/ZMM WriteVecLoad classes that we currently use, we can tweak these values in future patches once this infrastructure is in place.

Differential Revision: https://reviews.llvm.org/D52886

llvm-svn: 343868
2018-10-05 17:57:29 +00:00
Simon Pilgrim 00865a48d1 [X86] Split WriteIMul into 8/16/32/64 implementations (PR36931)
Split WriteIMul by size and also by IMUL multiply-by-imm and multiply-by-reg cases.

This removes all the scheduler overrides for gpr multiplies and stops WriteMULH being ignored for BMI2 MULX instructions.

llvm-svn: 342892
2018-09-24 15:21:57 +00:00
Craig Topper 2262613532 [X86] Remove isel patterns for ADCX instruction
There's no advantage to this instruction unless you need to avoid touching other flag bits. It's encoding is longer, it can't fold an immediate, it doesn't write all the flags.

I don't think gcc will generate this instruction either.

Fixes PR38852.

Differential Revision: https://reviews.llvm.org/D51754

llvm-svn: 342059
2018-09-12 15:47:34 +00:00
Craig Topper a2c9694bc8 [X86] Mark the ADCX and ADOX instruction as commutable.
llvm-svn: 341752
2018-09-08 18:47:56 +00:00
Craig Topper c96305970d [X86] Add commuted isel pattern for the load form of ADCX instructions.
This prevents the legacy ADC instruction from being favored over ADCX when the load is in the operand 0.

llvm-svn: 341745
2018-09-08 06:31:43 +00:00
Craig Topper 2c9dede9cb [X86] Add RMW ADC patterns with load in operand 1.
ADC is commutable and the load could be in either operand, but we were only checking operand 0.

Ideally we'd mark X86adc_flag as commutable and tablegen would automatically do this, but the EFLAGS register mention is preventing it.

llvm-svn: 341606
2018-09-06 23:55:36 +00:00
Craig Topper 0fd5cdee3a [X86] Add isel patterns for commuting X86adc_flag with a load in the LHS.
The peephole pass likely gets this normally, but we should be doing it during isel.

Ideally we'd just make the X86adc_flag pattern SDNPCommutable, but the tablegen doesn't handle that when one of the operands is a register reference.

llvm-svn: 341596
2018-09-06 22:41:44 +00:00
Craig Topper 04ded1ac1f [X86] Remove AddedComplexity from register form of NOT. NFCI
I believe isProfitableToFold will stop the load folding that this was intended to overcome.

Given an (xor load, -1), isProfitableToFold will see that the immediate can be folded with the xor using a one byte immediate since it can be sign extended. It doesn't know about NOT, but the one byte immediate check is enough to stop the fold.

llvm-svn: 336712
2018-07-10 19:09:00 +00:00
Simon Pilgrim 0c0336e003 [X86] Split WriteADC/WriteADCRMW scheduler classes
For integer ALU instructions taking eflags as an input (ADC/SBB/ADCX/ADOX)

llvm-svn: 332605
2018-05-17 12:43:42 +00:00
Simon Pilgrim 2864b46469 [X86] Split off WriteIMul64 from WriteIMul schedule class (PR36931)
This fixes a couple of BtVer2 missing instructions that weren't been handled in the override.

NOTE: There are still a lot of overrides that still need cleaning up!
llvm-svn: 331770
2018-05-08 14:55:16 +00:00
Simon Pilgrim 2580554333 [X86] Split WriteIDiv into div/idiv 8/16/32/64 implementations (PR36930)
I've created the necessary classes but there are still a lot of overrides that need cleaning up.

NOTE: The Znver1 model was missing some div/idiv variants in the instregex patterns and wasn't setting the resource cycles at all in the overrides.
llvm-svn: 331767
2018-05-08 13:51:45 +00:00
Simon Pilgrim 35935c0632 [X86] Remove remaining gpr schedule itineraries (PR37093)
llvm-svn: 329938
2018-04-12 18:46:15 +00:00
Craig Topper f0d042619b [X86] Attempt to model basic arithmetic instructions in the Haswell/Broadwell/Skylake scheduler models without InstRWs
Summary:
This patch removes InstRW overrides for basic arithmetic/logic instructions. To do this I've added the store address port to RMW. And used a WriteSequence to make the latency additive. It does not cover ADC/SBB because they have different latency.

Apparently we were inconsistent about whether the store has latency or not thus the test changes.

I've also left out Sandy Bridge because the load latency there is currently 4 cycles and should be 5.

Reviewers: RKSimon, andreadb

Reviewed By: andreadb

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45351

llvm-svn: 329416
2018-04-06 16:16:48 +00:00
Craig Topper a3cac956fc [X86] Use loadi16/loadi32 predicates in multiply patterns
llvm-svn: 329153
2018-04-04 07:00:19 +00:00
Craig Topper dc4a6d1ef6 [X86] Cleanup ADCX/ADOX instruction definitions.
Give them both the same itineraries. Add hasSideEffects = 0 to ADOX since they don't have patterns. Rename source operands to $src1 and $src2 instead of $src0 and $src. Add ReadAfterLd to the memory form SchedRW.

llvm-svn: 328952
2018-04-01 23:58:50 +00:00
Chandler Carruth 4244625c51 [x86] Correct the operand structure of the ADOX instruction.
This also moves to define it in the same way as ADCX which seems to use
constraints a bit better.

This is pulled out of the review for reducing the use of popf for
restoring EFLAGS, but is independent. There are still more problems with
our definitions for these instructions that Craig is going to look at
but this is at least less broken and he can start from this to improve
them more fully.

Thanks to Craig for the review here.

llvm-svn: 328945
2018-04-01 21:53:18 +00:00
Craig Topper 4778fa7e8a [X86] Fix the SchedRW for memory forms of CMP and TEST.
They were incorrectly marked as RMW operations. Some of the CMP instrucions worked, but the ones that use a similar encoding as RMW form of ADD ended up marked as RMW.

TEST used the same tablegen class as some of the CMPs.

llvm-svn: 327947
2018-03-20 03:55:17 +00:00
Craig Topper 5ccd87233f [X86] Make the multiply and divide itineraries more consistent.
Sometimes we used the same itinerary for MEM and REG forms, but that seems inconsistent with our usual usage.

We also used the MUL8 itinerary for MULX32/64 which was also weird.

The test changes are because we were using IIC_IMUL32_RR and IIC_IMUL64_RR instead of IIC_IMUL32_REG/IIC_IMUL64_REG for the 32 and 64 bit multiplies that produce double width result.

llvm-svn: 327866
2018-03-19 16:38:33 +00:00
Craig Topper 98ae8f833f [X86] Change some compare patterns to use loadi8/loadi16/loadi32/loadi64 helper fragments.
This enables CMP8mi to fold zextloadi8i1 which in all tests allows us to avoid creating a TEST8rr that peephole can't fold.

llvm-svn: 324863
2018-02-12 02:48:42 +00:00
Craig Topper 9a06f24704 [X86] Artificially lower the complexity of the scalar ANDN patterns so that AND with immediate will match first.
This allows the immediate to folded into the and instead of being forced to move into a register. This can sometimes result in shorter encodings since the and can sign extend an immediate.

This also allows us to match an and to a movzx after a not.

This can cause an extra move if the input to the separate NOT has an additional user which requires a copy before the NOT.

llvm-svn: 324260
2018-02-05 18:31:04 +00:00
Amaury Sechet f89f188ddb [X86] Avoid using high register trick for test instruction
Summary:
It seems it's main effect is to create addition copies when values are inr register that do not support this trick, which increase register pressure and makes the code bigger.

Reviewers: craig.topper, niravd, spatel, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42646

llvm-svn: 323888
2018-01-31 16:48:54 +00:00
Eric Liu 0b69b5ed85 Revert "[X86] Avoid using high register trick for test instruction"
This reverts commit r323690. This causes crash in llc. See the original commit thread for details.

llvm-svn: 323761
2018-01-30 14:18:33 +00:00
Amaury Sechet 015184b79e [X86] Avoid using high register trick for test instruction
Summary:
It seems it's main effect is to create addition copies when values are inr register that do not support this trick, which increase register pressure and makes the code bigger.

The main noteworthy regression I was able to observe was pattern of the type (setcc (trunc (and X, C)), 0) where C is such as it would benefit from the hi register trick. To prevent this, a new pattern is added to materialize such pattern using a 32 bits test. This has the added benefit of working with any constant that is materializable as a 32bits immediate, not just the ones that can leverage the high register trick, as demonstrated by the test case in test-shrink.ll using the constant 2049 .

Reviewers: craig.topper, niravd, spatel, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42646

llvm-svn: 323690
2018-01-29 20:54:33 +00:00
Craig Topper 23c348850f [X86] Add 'Requires<[In64BitMode]>' to a bunch of instructions that only have memory and immediate operands.
The asm parser wasn't preventing these from being accepted in 32-bit mode. Instructions that use a GR64 register are protected by the parser rejecting the register in 32-bit mode.

llvm-svn: 320846
2017-12-15 19:01:51 +00:00
Craig Topper c20b46da2f [X86] Change register&memory TEST instructions from MRMSrcMem to MRMDstMem
Summary:
Intel documentation shows the memory operand as the first operand. But we currently treat it as the second operand. Conceptually the order doesn't matter since it doesn't write memory. We have aliases to parse with the operands in either order and the isel matching is commutable.

For the register&register form order does matter for the assembly parser. PR22995 was previously filed and fixed by changing the register&register form from MRMSrcReg to MRMDestReg to match gas. Ideally the memory form should match by using MRMDestMem.

I believe this supercedes D38025 which was trying to switch the register&register form back to pre-PR22995.

Reviewers: aymanmus, RKSimon, zvi

Reviewed By: aymanmus

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38120

llvm-svn: 314639
2017-10-01 23:53:53 +00:00
Craig Topper d1252692a4 [X86] Remove unused tablegen class.
llvm-svn: 313861
2017-09-21 04:55:06 +00:00
Craig Topper 3be1db82b6 [X86] Don't disable slow INC/DEC if optimizing for size
Summary:
Just because INC/DEC is a little slow on some processors doesn't mean we shouldn't prefer it when optimizing for size.

This appears to match gcc behavior.

Reviewers: chandlerc, zvi, RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37177

llvm-svn: 312866
2017-09-09 17:11:59 +00:00
Craig Topper 69ec201848 [X86] Qualify the RMW INC/DEC patterns with NotSlowIncDec.
We were suppressing most uses of INC/DEC, but this one seems to have been missed.

llvm-svn: 311828
2017-08-26 06:24:25 +00:00
Ayman Musa 0b4f97d5e9 [X86] Adding FoldGenRegForm helper field (for memory folding tables tableGen backend) to X86Inst class and set its value for the relevant instructions.
Some register-register instructions can be encoded in 2 different ways, this happens when 2 register operands can be folded (separately). 
For example if we look at the MOV8rr and MOV8rr_REV, both instructions perform exactly the same operation, but are encoded differently. Here is the relevant information about these instructions from Intel's 64-ia-32-architectures-software-developer-manual:

Opcode  Instruction  Op/En  64-Bit Mode  Compat/Leg Mode  Description
8A /r   MOV r8,r/m8  RM     Valid        Valid            Move r/m8 to r8.
88 /r   MOV r/m8,r8  MR     Valid        Valid            Move r8 to r/m8.
Here we can see that in order to enable the folding of the output and input registers, we had to define 2 "encodings", and as a result we got 2 move 8-bit register-register instructions.

In the X86 backend, we define both of these instructions, usually one has a regular name (MOV8rr) while the other has "_REV" suffix (MOV8rr_REV), must be marked with isCodeGenOnly flag and is not emitted from CodeGen.

Automatically generating the memory folding tables relies on matching encodings of instructions, but in these cases where we want to map both memory forms of the mov 8-bit (MOV8rm & MOV8mr) to MOV8rr (not to MOV8rr_REV) we have to somehow point from the MOV8rr_REV to the "regular" appropriate instruction which in this case is MOV8rr.

This field enable this "pointing" mechanism - which is used in the TableGen backend for generating memory folding tables.

Differential Revision: https://reviews.llvm.org/D32683

llvm-svn: 304087
2017-05-28 12:39:37 +00:00
Ayman Musa 11966ab00b [X86] Add missing mayLoad/mayStore attributes to some X86 instructions (Continue)
Complete the patch committed in rL300190.

Differential Revision: https://reviews.llvm.org/D32287

llvm-svn: 301393
2017-04-26 11:34:09 +00:00
Sanjay Patel 904cd39b05 [x86] Allow merging multiple instances of an immediate within a basic block for code size savings, for 64-bit constants.
This patch handles 64-bit constants which can be encoded as 32-bit immediates.

It extends the functionality added by https://reviews.llvm.org/D11363 for 32-bit constants to 64-bit constants.

Patch by Sunita Marathe!

Differential Revision: https://reviews.llvm.org/D23391

llvm-svn: 278857
2016-08-16 21:35:16 +00:00
Craig Topper fcc34bdee0 [X86] Fix CMP and TEST with al/ax/eax/rax to not mark EFLAGS as a use or al/ax/eax/rax as a def. Probably doesn't have a functional affect since these aren't used in isel.
llvm-svn: 249994
2015-10-11 19:54:02 +00:00
Michael Kuperstein 243c073a2e [X86] Allow merging of immediates within a basic block for code size savings
First step in preventing immediates that occur more than once within a single
basic block from being pulled into their users, in order to prevent unnecessary
large instruction encoding .Currently enabled only when optimizing for size.

Patch by: zia.ansari@intel.com
Differential Revision: http://reviews.llvm.org/D11363

llvm-svn: 244601
2015-08-11 14:10:58 +00:00
Rafael Espindola dd3add6c60 Fix the operand encoding in the test instruction.
Fixes pr22995.

llvm-svn: 233686
2015-03-31 12:31:55 +00:00
Craig Topper 7c10252943 [X86] Don't print 'dword ptr' or 'qword ptr' on the operand to some of the LEA variants in Intel syntax. The memory operand is inherently unsized.
llvm-svn: 225432
2015-01-08 07:41:30 +00:00
Craig Topper ddbf51f904 [X86] Make isel select the 2-byte register form of INC/DEC even in non-64-bit mode. Convert to the 1-byte form in non-64-bit mode as part of MCInst lowering.
Overall this seems simpler. It reduces duplication of patterns between both modes and it simplifies the memory folding/unfolding tables as they don't need to create fake instructions just to keep track of 64-bitness.

llvm-svn: 225252
2015-01-06 07:35:50 +00:00
Craig Topper dc2fc8035d [X86] Remove the predicates from the register forms of the 2-byte inc and dec instructions. Remove the 32-bit mode only versions that existed for the disassembler. Move the patterns out of the instructions so they can still be qualified with predicates.
llvm-svn: 225157
2015-01-05 08:19:12 +00:00
Craig Topper 31d6d9a0fb [X86] Fix some cases where some 8-bit instructions were marked as being convertible to three address instructions, but aren't really.
llvm-svn: 224940
2014-12-29 16:25:26 +00:00
Craig Topper 874a1966ae [X86] Add the 0x82 instructions to the disassebmler. They are identical in functionality to the 0x80 opcode instructions, but are not valid in 64-bit mode.
llvm-svn: 224939
2014-12-29 16:25:23 +00:00
Craig Topper c51b7993b8 [x86] Refactor some tablegen instruction info classes slightly to prepare for another change. NFC.
llvm-svn: 224938
2014-12-29 16:25:22 +00:00
Craig Topper 56c8e05c24 [x86] Remove unused classes from tablegen instruction info.
llvm-svn: 224937
2014-12-29 16:25:19 +00:00
Craig Topper 2e2aee0cd6 [X86] Remove unnecessary 'In64BitMode' predicate for instructions that already indicate use of REX.W.
llvm-svn: 224495
2014-12-18 05:02:08 +00:00
Michael Liao 5bf9578ce4 [X86] Clean up whitespace as well as minor coding style
llvm-svn: 223339
2014-12-04 05:20:33 +00:00
Craig Topper c50d64b07b Replace neverHasSideEffects=1 with hasSideEffects=0 in all .td files.
llvm-svn: 222801
2014-11-26 00:46:26 +00:00
Robert Khasanov 7c5a843646 [x86] Broadwell: ADOX/ADCX. Added _addcarryx_u{32|64} intrinsics to LLVM.
llvm-svn: 216162
2014-08-21 09:27:00 +00:00
Akira Hatanaka 7cc27649a6 [X86] Mark pseudo instruction TEST8ri_NOEREX as hasSIdeEffects=0.
Also, add a case clause in X86InstrInfo::shouldScheduleAdjacent to enable
macro-fusion.

<rdar://problem/15680770>

llvm-svn: 212747
2014-07-10 18:00:53 +00:00
Craig Topper 5ccb61781f Add an x86 prefix encoding for instructions that would decode to a different instruction with 0xf2/f3/66 were in front of them, but don't themselves have a prefix. For now this doesn't change any bbehavior, but plan to use it to fix some bugs in the disassembler.
llvm-svn: 201538
2014-02-18 00:21:49 +00:00
Craig Topper fa6298a162 Merge x86 HasOpSizePrefix/HasOpSize16Prefix into a 2-bit OpSize field with 0 meaning no 0x66 prefix in any mode. Rename Opsize16->OpSize32 and OpSize->OpSize16. The classes now refer to their operand size rather than the mode in which they need a 0x66 prefix. Hopefully can merge REX_W into this as OpSize64.
llvm-svn: 200626
2014-02-02 09:25:09 +00:00
David Woodhouse 0b6c94909e [x86] Fix signed relocations for i64i32imm operands
These should end up (in ELF) as R_X86_64_32S relocs, not R_X86_64_32.
Kill the horrid and incomplete special case and FIXME in
EncodeInstruction() and set things up so it can infer the signedness
from the ImmType just like it can the size and whether it's PC-relative.

llvm-svn: 200495
2014-01-30 22:20:41 +00:00
Craig Topper 80ab268b06 Switch a few instructions to use RI instead I so they don't require REX_W to be explicitly specified.
llvm-svn: 199479
2014-01-17 08:16:57 +00:00