llvm-project/llvm/lib/Target/X86/X86SchedPredicates.td

//===-- X86SchedPredicates.td - X86 Scheduling Predicates --*- tablegen -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file defines scheduling predicate definitions that are common to
// all X86 subtargets.
//
//===----------------------------------------------------------------------===//

// A predicate used to identify dependency-breaking instructions that clear the
// content of the destination register. Note that this predicate only checks if
// input registers are the same. This predicate doesn't make any assumptions on
// the expected instruction opcodes, because different processors may implement
// different zero-idioms.
def ZeroIdiomPredicate : CheckSameRegOperand<1, 2>;

// A predicate used to identify VPERM that have bits 3 and 7 of their mask set.
// On some processors, these VPERM instructions are zero-idioms.
def ZeroIdiomVPERMPredicate : CheckAll<[
  ZeroIdiomPredicate,
  CheckImmOperand<3, 0x88>
]>;

// A predicate used to check if a LEA instruction uses all three source
// operands: base, index, and offset.
def IsThreeOperandsLEAPredicate: CheckAll<[
  // isRegOperand(Base)
  CheckIsRegOperand<1>,
  CheckNot<CheckInvalidRegOperand<1>>,

  // isRegOperand(Index)
  CheckIsRegOperand<3>,
  CheckNot<CheckInvalidRegOperand<3>>,

  // hasLEAOffset(Offset)
  CheckAny<[
    CheckAll<[
      CheckIsImmOperand<4>,
      CheckNot<CheckZeroOperand<4>>
    ]>,
    CheckNonPortable<"MI.getOperand(4).isGlobal()">
  ]>
]>;

def LEACases : MCOpcodeSwitchCase<
    [LEA32r, LEA64r, LEA64_32r, LEA16r],
    MCReturnStatement<IsThreeOperandsLEAPredicate>
>;

// Used to generate the body of a TII member function.
def IsThreeOperandsLEABody :
    MCOpcodeSwitchStatement<[LEACases], MCReturnStatement<FalsePred>>;

// This predicate evaluates to true only if the input machine instruction is a
// 3-operands LEA.  Tablegen automatically generates a new method for it in
// X86GenInstrInfo.
def IsThreeOperandsLEAFn :
    TIIPredicate<"isThreeOperandsLEA", IsThreeOperandsLEABody>;
[X86][BtVer2] correctly model the latency/throughput of LEA instructions. This patch fixes the latency/throughput of LEA instructions in the BtVer2 scheduling model. On Jaguar, A 3-operands LEA has a latency of 2cy, and a reciprocal throughput of 1. That is because it uses one cycle of SAGU followed by 1cy of ALU1. An LEA with a "Scale" operand is also slow, and it has the same latency profile as the 3-operands LEA. An LEA16r has a latency of 3cy, and a throughput of 0.5 (i.e. RThrouhgput of 2.0). This patch adds a new TIIPredicate named IsThreeOperandsLEAFn to X86Schedule.td. The tablegen backend (for instruction-info) expands that definition into this (file X86GenInstrInfo.inc): ``` static bool isThreeOperandsLEA(const MachineInstr &MI) { return ( ( MI.getOpcode() == X86::LEA32r \|\| MI.getOpcode() == X86::LEA64r \|\| MI.getOpcode() == X86::LEA64_32r \|\| MI.getOpcode() == X86::LEA16r ) && MI.getOperand(1).isReg() && MI.getOperand(1).getReg() != 0 && MI.getOperand(3).isReg() && MI.getOperand(3).getReg() != 0 && ( ( MI.getOperand(4).isImm() && MI.getOperand(4).getImm() != 0 ) \|\| (MI.getOperand(4).isGlobal()) ) ); } ``` A similar method is generated in the X86_MC namespace, and included into X86MCTargetDesc.cpp (the declaration lives in X86MCTargetDesc.h). Back to the BtVer2 scheduling model: A new scheduling predicate named JSlowLEAPredicate now checks if either the instruction is a three-operands LEA, or it is an LEA with a Scale value different than 1. A variant scheduling class uses that new predicate to correctly select the appropriate latency profile. Differential Revision: https://reviews.llvm.org/D49436 llvm-svn: 337469 2018-07-20 00:42:15 +08:00			`//===-- X86SchedPredicates.td - X86 Scheduling Predicates --- tablegen --===//`
			`//`
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636 2019-01-19 16:50:56 +08:00			`// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.`
			`// See https://llvm.org/LICENSE.txt for license information.`
			`// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception`
[X86][BtVer2] correctly model the latency/throughput of LEA instructions. This patch fixes the latency/throughput of LEA instructions in the BtVer2 scheduling model. On Jaguar, A 3-operands LEA has a latency of 2cy, and a reciprocal throughput of 1. That is because it uses one cycle of SAGU followed by 1cy of ALU1. An LEA with a "Scale" operand is also slow, and it has the same latency profile as the 3-operands LEA. An LEA16r has a latency of 3cy, and a throughput of 0.5 (i.e. RThrouhgput of 2.0). This patch adds a new TIIPredicate named IsThreeOperandsLEAFn to X86Schedule.td. The tablegen backend (for instruction-info) expands that definition into this (file X86GenInstrInfo.inc): ``` static bool isThreeOperandsLEA(const MachineInstr &MI) { return ( ( MI.getOpcode() == X86::LEA32r \|\| MI.getOpcode() == X86::LEA64r \|\| MI.getOpcode() == X86::LEA64_32r \|\| MI.getOpcode() == X86::LEA16r ) && MI.getOperand(1).isReg() && MI.getOperand(1).getReg() != 0 && MI.getOperand(3).isReg() && MI.getOperand(3).getReg() != 0 && ( ( MI.getOperand(4).isImm() && MI.getOperand(4).getImm() != 0 ) \|\| (MI.getOperand(4).isGlobal()) ) ); } ``` A similar method is generated in the X86_MC namespace, and included into X86MCTargetDesc.cpp (the declaration lives in X86MCTargetDesc.h). Back to the BtVer2 scheduling model: A new scheduling predicate named JSlowLEAPredicate now checks if either the instruction is a three-operands LEA, or it is an LEA with a Scale value different than 1. A variant scheduling class uses that new predicate to correctly select the appropriate latency profile. Differential Revision: https://reviews.llvm.org/D49436 llvm-svn: 337469 2018-07-20 00:42:15 +08:00			`//`
			`//===----------------------------------------------------------------------===//`
			`//`
			`// This file defines scheduling predicate definitions that are common to`
			`// all X86 subtargets.`
			`//`
			`//===----------------------------------------------------------------------===//`

			`// A predicate used to identify dependency-breaking instructions that clear the`
			`// content of the destination register. Note that this predicate only checks if`
			`// input registers are the same. This predicate doesn't make any assumptions on`
			`// the expected instruction opcodes, because different processors may implement`
			`// different zero-idioms.`
			`def ZeroIdiomPredicate : CheckSameRegOperand<1, 2>;`

[X86][BtVer2] Teach how to identify zero-idiom VPERM2F128rr instructions. This patch adds another variant class to identify zero-idiom VPERM2F128rr instructions. On Jaguar, a VPERM wih bit 3 and 7 of the mask set, is a zero-idiom. Differential Revision: https://reviews.llvm.org/D52663 llvm-svn: 343452 2018-10-01 18:35:13 +08:00			`// A predicate used to identify VPERM that have bits 3 and 7 of their mask set.`
			`// On some processors, these VPERM instructions are zero-idioms.`
			`def ZeroIdiomVPERMPredicate : CheckAll<[`
			`ZeroIdiomPredicate,`
			`CheckImmOperand<3, 0x88>`
			`]>;`

[MC][PredicateExpander] Extend the grammar to support simple switch and return statements. This patch introduces tablegen class MCStatement. Currently, an MCStatement can be either a return statement, or a switch statement. ``` MCStatement: MCReturnStatement MCOpcodeSwitchStatement ``` A MCReturnStatement expands to a return statement, and the boolean expression associated with the return statement is described by a MCInstPredicate. An MCOpcodeSwitchStatement is a switch statement where the condition is a check on the machine opcode. It allows the definition of multiple checks, as well as a default case. More details on the grammar implemented by these two new constructs can be found in the diff for TargetInstrPredicates.td. This patch makes it easier to read the body of auto-generated TargetInstrInfo predicates. In future, I plan to reuse/extend the MCStatement grammar to describe more complex target hooks. For now, this is just a first step (mostly a minor cosmetic change to polish the new predicates framework). Differential Revision: https://reviews.llvm.org/D50457 llvm-svn: 339352 2018-08-09 23:32:48 +08:00			`// A predicate used to check if a LEA instruction uses all three source`
			`// operands: base, index, and offset.`
[X86][BtVer2] correctly model the latency/throughput of LEA instructions. This patch fixes the latency/throughput of LEA instructions in the BtVer2 scheduling model. On Jaguar, A 3-operands LEA has a latency of 2cy, and a reciprocal throughput of 1. That is because it uses one cycle of SAGU followed by 1cy of ALU1. An LEA with a "Scale" operand is also slow, and it has the same latency profile as the 3-operands LEA. An LEA16r has a latency of 3cy, and a throughput of 0.5 (i.e. RThrouhgput of 2.0). This patch adds a new TIIPredicate named IsThreeOperandsLEAFn to X86Schedule.td. The tablegen backend (for instruction-info) expands that definition into this (file X86GenInstrInfo.inc): ``` static bool isThreeOperandsLEA(const MachineInstr &MI) { return ( ( MI.getOpcode() == X86::LEA32r \|\| MI.getOpcode() == X86::LEA64r \|\| MI.getOpcode() == X86::LEA64_32r \|\| MI.getOpcode() == X86::LEA16r ) && MI.getOperand(1).isReg() && MI.getOperand(1).getReg() != 0 && MI.getOperand(3).isReg() && MI.getOperand(3).getReg() != 0 && ( ( MI.getOperand(4).isImm() && MI.getOperand(4).getImm() != 0 ) \|\| (MI.getOperand(4).isGlobal()) ) ); } ``` A similar method is generated in the X86_MC namespace, and included into X86MCTargetDesc.cpp (the declaration lives in X86MCTargetDesc.h). Back to the BtVer2 scheduling model: A new scheduling predicate named JSlowLEAPredicate now checks if either the instruction is a three-operands LEA, or it is an LEA with a Scale value different than 1. A variant scheduling class uses that new predicate to correctly select the appropriate latency profile. Differential Revision: https://reviews.llvm.org/D49436 llvm-svn: 337469 2018-07-20 00:42:15 +08:00			`def IsThreeOperandsLEAPredicate: CheckAll<[`
			`// isRegOperand(Base)`
			`CheckIsRegOperand<1>,`
			`CheckNot<CheckInvalidRegOperand<1>>,`

			`// isRegOperand(Index)`
			`CheckIsRegOperand<3>,`
			`CheckNot<CheckInvalidRegOperand<3>>,`

			`// hasLEAOffset(Offset)`
			`CheckAny<[`
			`CheckAll<[`
			`CheckIsImmOperand<4>,`
			`CheckNot<CheckZeroOperand<4>>`
			`]>,`
			`CheckNonPortable<"MI.getOperand(4).isGlobal()">`
			`]>`
			`]>;`

[MC][PredicateExpander] Extend the grammar to support simple switch and return statements. This patch introduces tablegen class MCStatement. Currently, an MCStatement can be either a return statement, or a switch statement. ``` MCStatement: MCReturnStatement MCOpcodeSwitchStatement ``` A MCReturnStatement expands to a return statement, and the boolean expression associated with the return statement is described by a MCInstPredicate. An MCOpcodeSwitchStatement is a switch statement where the condition is a check on the machine opcode. It allows the definition of multiple checks, as well as a default case. More details on the grammar implemented by these two new constructs can be found in the diff for TargetInstrPredicates.td. This patch makes it easier to read the body of auto-generated TargetInstrInfo predicates. In future, I plan to reuse/extend the MCStatement grammar to describe more complex target hooks. For now, this is just a first step (mostly a minor cosmetic change to polish the new predicates framework). Differential Revision: https://reviews.llvm.org/D50457 llvm-svn: 339352 2018-08-09 23:32:48 +08:00			`def LEACases : MCOpcodeSwitchCase<`
			`[LEA32r, LEA64r, LEA64_32r, LEA16r],`
			`MCReturnStatement<IsThreeOperandsLEAPredicate>`
			`>;`

			`// Used to generate the body of a TII member function.`
			`def IsThreeOperandsLEABody :`
			`MCOpcodeSwitchStatement<[LEACases], MCReturnStatement<FalsePred>>;`

[X86][BtVer2] correctly model the latency/throughput of LEA instructions. This patch fixes the latency/throughput of LEA instructions in the BtVer2 scheduling model. On Jaguar, A 3-operands LEA has a latency of 2cy, and a reciprocal throughput of 1. That is because it uses one cycle of SAGU followed by 1cy of ALU1. An LEA with a "Scale" operand is also slow, and it has the same latency profile as the 3-operands LEA. An LEA16r has a latency of 3cy, and a throughput of 0.5 (i.e. RThrouhgput of 2.0). This patch adds a new TIIPredicate named IsThreeOperandsLEAFn to X86Schedule.td. The tablegen backend (for instruction-info) expands that definition into this (file X86GenInstrInfo.inc): ``` static bool isThreeOperandsLEA(const MachineInstr &MI) { return ( ( MI.getOpcode() == X86::LEA32r \|\| MI.getOpcode() == X86::LEA64r \|\| MI.getOpcode() == X86::LEA64_32r \|\| MI.getOpcode() == X86::LEA16r ) && MI.getOperand(1).isReg() && MI.getOperand(1).getReg() != 0 && MI.getOperand(3).isReg() && MI.getOperand(3).getReg() != 0 && ( ( MI.getOperand(4).isImm() && MI.getOperand(4).getImm() != 0 ) \|\| (MI.getOperand(4).isGlobal()) ) ); } ``` A similar method is generated in the X86_MC namespace, and included into X86MCTargetDesc.cpp (the declaration lives in X86MCTargetDesc.h). Back to the BtVer2 scheduling model: A new scheduling predicate named JSlowLEAPredicate now checks if either the instruction is a three-operands LEA, or it is an LEA with a Scale value different than 1. A variant scheduling class uses that new predicate to correctly select the appropriate latency profile. Differential Revision: https://reviews.llvm.org/D49436 llvm-svn: 337469 2018-07-20 00:42:15 +08:00			`// This predicate evaluates to true only if the input machine instruction is a`
			`// 3-operands LEA. Tablegen automatically generates a new method for it in`
			`// X86GenInstrInfo.`
			`def IsThreeOperandsLEAFn :`
[Tablegen][MCInstPredicate] Removed redundant template argument from class TIIPredicate, and implemented verification rules for TIIPredicates. This patch removes redundant template argument `TargetName` from TIIPredicate. Tablegen can always infer the target name from the context. So we don't need to force users of TIIPredicate to always specify it. This allows us to better modularize the tablegen class hierarchy for the so-called "function predicates". class FunctionPredicateBase has been added; it is currently used as a building block for TIIPredicates. However, I plan to reuse that class to model other function predicate classes too (i.e. not just TIIPredicates). For example, this can be a first step towards implementing proper support for dependency breaking instructions in tablegen. This patch also adds a verification step on TIIPredicates in tablegen. We cannot have multiple TIIPredicates with the same name. Otherwise, this will cause build errors later on, when tablegen'd .inc files are included by cpp files and then compiled. Differential Revision: https://reviews.llvm.org/D50708 llvm-svn: 339706 2018-08-15 02:36:54 +08:00			`TIIPredicate<"isThreeOperandsLEA", IsThreeOperandsLEABody>;`