llvm-project/llvm/lib/Target/X86/MCTargetDesc/X86IntelInstPrinter.cpp

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

455 lines
16 KiB
C++
Raw Normal View History

2012-10-04 03:00:20 +08:00
//===-- X86IntelInstPrinter.cpp - Intel assembly instruction printing -----===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
2012-10-04 03:00:20 +08:00
// This file includes code for rendering MCInst instances as Intel-style
// assembly.
//
//===----------------------------------------------------------------------===//
#include "X86IntelInstPrinter.h"
#include "X86BaseInfo.h"
#include "X86InstComments.h"
#include "llvm/MC/MCExpr.h"
#include "llvm/MC/MCInst.h"
[llvm-objdump] Symbolize binary addresses for low-noisy asm diff. When diffing disassembly dump of two binaries, I see lots of noises from mismatched jump target addresses and global data references, which unnecessarily causes diffs on every function, making it impractical. I'm trying to symbolize the raw binary addresses to minimize the diff noise. In this change, a local branch target is modeled as a label and the branch target operand will simply be printed as a label. Local labels are collected by a separate pre-decoding pass beforehand. A global data memory operand will be printed as a global symbol instead of the raw data address. Unfortunately, due to the way the disassembler is set up and to be less intrusive, a global symbol is always printed as the last operand of a memory access instruction. This is less than ideal but is probably acceptable from checking code quality point of view since on most targets an instruction can have at most one memory operand. So far only the X86 disassemblers are supported. Test Plan: llvm-objdump -d --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr : ``` Disassembly of section .text: <_start>: push rax mov dword ptr [rsp + 4], 0 mov dword ptr [rsp], 0 mov eax, dword ptr [rsp] cmp eax, dword ptr [rip + 4112] # 202182 <g> jge 0x20117e <_start+0x25> call 0x201158 <foo> inc dword ptr [rsp] jmp 0x201169 <_start+0x10> xor eax, eax pop rcx ret ``` llvm-objdump -d **--symbolize-operands** --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr : ``` Disassembly of section .text: <_start>: push rax mov dword ptr [rsp + 4], 0 mov dword ptr [rsp], 0 <L1>: mov eax, dword ptr [rsp] cmp eax, dword ptr <g> jge <L0> call <foo> inc dword ptr [rsp] jmp <L1> <L0>: xor eax, eax pop rcx ret ``` Note that the jump instructions like `jge 0x20117e <_start+0x25>` without this work is printed as a real target address and an offset from the leading symbol. With a change in the optimizer that adds/deletes an instruction, the address and offset may shift for targets placed after the instruction. This will be a problem when diffing the disassembly from two optimizers where there are unnecessary false positives due to such branch target address changes. With `--symbolize-operand`, a label is printed for a branch target instead to reduce the false positives. Similarly, the disassemble of PC-relative global variable references is also prone to instruction insertion/deletion. Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D84191
2020-07-21 00:45:32 +08:00
#include "llvm/MC/MCInstrAnalysis.h"
#include "llvm/MC/MCInstrDesc.h"
#include "llvm/MC/MCInstrInfo.h"
#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/ErrorHandling.h"
#include <cassert>
#include <cstdint>
using namespace llvm;
#define DEBUG_TYPE "asm-printer"
// Include the auto-generated portion of the assembly writer.
#define PRINT_ALIAS_INSTR
#include "X86GenAsmWriter1.inc"
void X86IntelInstPrinter::printRegName(raw_ostream &OS, unsigned RegNo) const {
OS << getRegisterName(RegNo);
}
void X86IntelInstPrinter::printInst(const MCInst *MI, uint64_t Address,
StringRef Annot, const MCSubtargetInfo &STI,
raw_ostream &OS) {
printInstFlags(MI, OS);
// In 16-bit mode, print data16 as data32.
if (MI->getOpcode() == X86::DATA16_PREFIX &&
STI.getFeatureBits()[X86::Mode16Bit]) {
OS << "\tdata32";
} else if (!printAliasInstr(MI, Address, OS) && !printVecCompareInstr(MI, OS))
printInstruction(MI, Address, OS);
// Next always print the annotation.
printAnnotation(OS, Annot);
// If verbose assembly is enabled, we can print some informative comments.
if (CommentStream)
EmitAnyX86InstComments(MI, *CommentStream, MII);
}
[X86] Remove the _alt forms of XOP VPCOM instructions. Use a combination of custom printing and custom parsing to achieve the same result and more Previously we had a regular form of the instruction used when the immediate was 0-7. And _alt form that allowed the full 8 bit immediate. Codegen would always use the 0-7 form since the immediate was always checked to be in range. Assembly parsing would use the 0-7 form when a mnemonic like vpcomtrueb was used. If the immediate was specified directly the _alt form was used. The disassembler would prefer to use the 0-7 form instruction when the immediate was in range and the _alt form otherwise. This way disassembly would print the most readable form when possible. The assembly parsing for things like vpcomtrueb relied on splitting the mnemonic into 3 pieces. A "vpcom" prefix, an immediate representing the "true", and a suffix of "b". The tablegenerated printing code would similarly print a "vpcom" prefix, decode the immediate into a string, and then print "b". The _alt form on the other hand parsed and printed like any other instruction with no specialness. With this patch we drop to one form and solve the disassembly printing issue by doing custom printing when the immediate is 0-7. The parsing code has been tweaked to turn "vpcomtrueb" into "vpcomb" and then the immediate for the "true" is inserted either before or after the other operands depending on at&t or intel syntax. I'd rather not do the custom printing, but I tried using an InstAlias for each possible mnemonic for all 8 immediates for all 16 combinations of element size, signedness, and memory/register. The code emitted into printAliasInstr ended up checking the number of operands, the register class of each operand, and the immediate for all 256 aliases. This was repeated for both the at&t and intel printer. Despite a lot of common checks between all of the aliases, when compiled with clang at least this commonality was not well optimized. Nor do all the checks seem necessary. Since I want to do a similar thing for vcmpps/pd/ss/sd which have 32 immediate values and 3 encoding flavors, 3 register sizes, etc. This didn't seem to scale well for clang binary size. So custom printing seemed a better trade off. I also considered just using the InstAlias for the matching and not the printing. But that seemed like it would add a lot of extra rows to the matcher table. Especially given that the 32 immediates for vpcmpps have 46 strings associated with them. Differential Revision: https://reviews.llvm.org/D59398 llvm-svn: 356343
2019-03-18 05:21:37 +08:00
bool X86IntelInstPrinter::printVecCompareInstr(const MCInst *MI, raw_ostream &OS) {
if (MI->getNumOperands() == 0 ||
!MI->getOperand(MI->getNumOperands() - 1).isImm())
return false;
int64_t Imm = MI->getOperand(MI->getNumOperands() - 1).getImm();
[X86] Remove the _alt forms of XOP VPCOM instructions. Use a combination of custom printing and custom parsing to achieve the same result and more Previously we had a regular form of the instruction used when the immediate was 0-7. And _alt form that allowed the full 8 bit immediate. Codegen would always use the 0-7 form since the immediate was always checked to be in range. Assembly parsing would use the 0-7 form when a mnemonic like vpcomtrueb was used. If the immediate was specified directly the _alt form was used. The disassembler would prefer to use the 0-7 form instruction when the immediate was in range and the _alt form otherwise. This way disassembly would print the most readable form when possible. The assembly parsing for things like vpcomtrueb relied on splitting the mnemonic into 3 pieces. A "vpcom" prefix, an immediate representing the "true", and a suffix of "b". The tablegenerated printing code would similarly print a "vpcom" prefix, decode the immediate into a string, and then print "b". The _alt form on the other hand parsed and printed like any other instruction with no specialness. With this patch we drop to one form and solve the disassembly printing issue by doing custom printing when the immediate is 0-7. The parsing code has been tweaked to turn "vpcomtrueb" into "vpcomb" and then the immediate for the "true" is inserted either before or after the other operands depending on at&t or intel syntax. I'd rather not do the custom printing, but I tried using an InstAlias for each possible mnemonic for all 8 immediates for all 16 combinations of element size, signedness, and memory/register. The code emitted into printAliasInstr ended up checking the number of operands, the register class of each operand, and the immediate for all 256 aliases. This was repeated for both the at&t and intel printer. Despite a lot of common checks between all of the aliases, when compiled with clang at least this commonality was not well optimized. Nor do all the checks seem necessary. Since I want to do a similar thing for vcmpps/pd/ss/sd which have 32 immediate values and 3 encoding flavors, 3 register sizes, etc. This didn't seem to scale well for clang binary size. So custom printing seemed a better trade off. I also considered just using the InstAlias for the matching and not the printing. But that seemed like it would add a lot of extra rows to the matcher table. Especially given that the 32 immediates for vpcmpps have 46 strings associated with them. Differential Revision: https://reviews.llvm.org/D59398 llvm-svn: 356343
2019-03-18 05:21:37 +08:00
const MCInstrDesc &Desc = MII.get(MI->getOpcode());
// Custom print the vector compare instructions to get the immediate
// translated into the mnemonic.
switch (MI->getOpcode()) {
case X86::CMPPDrmi: case X86::CMPPDrri:
case X86::CMPPSrmi: case X86::CMPPSrri:
case X86::CMPSDrm: case X86::CMPSDrr:
case X86::CMPSDrm_Int: case X86::CMPSDrr_Int:
case X86::CMPSSrm: case X86::CMPSSrr:
case X86::CMPSSrm_Int: case X86::CMPSSrr_Int:
if (Imm >= 0 && Imm <= 7) {
OS << '\t';
printCMPMnemonic(MI, /*IsVCMP*/false, OS);
printOperand(MI, 0, OS);
OS << ", ";
// Skip operand 1 as its tied to the dest.
if ((Desc.TSFlags & X86II::FormMask) == X86II::MRMSrcMem) {
if ((Desc.TSFlags & X86II::OpPrefixMask) == X86II::XS)
printdwordmem(MI, 2, OS);
else if ((Desc.TSFlags & X86II::OpPrefixMask) == X86II::XD)
printqwordmem(MI, 2, OS);
else
printxmmwordmem(MI, 2, OS);
} else
printOperand(MI, 2, OS);
return true;
}
break;
case X86::VCMPPDrmi: case X86::VCMPPDrri:
case X86::VCMPPDYrmi: case X86::VCMPPDYrri:
case X86::VCMPPDZ128rmi: case X86::VCMPPDZ128rri:
case X86::VCMPPDZ256rmi: case X86::VCMPPDZ256rri:
case X86::VCMPPDZrmi: case X86::VCMPPDZrri:
case X86::VCMPPSrmi: case X86::VCMPPSrri:
case X86::VCMPPSYrmi: case X86::VCMPPSYrri:
case X86::VCMPPSZ128rmi: case X86::VCMPPSZ128rri:
case X86::VCMPPSZ256rmi: case X86::VCMPPSZ256rri:
case X86::VCMPPSZrmi: case X86::VCMPPSZrri:
case X86::VCMPSDrm: case X86::VCMPSDrr:
case X86::VCMPSDZrm: case X86::VCMPSDZrr:
case X86::VCMPSDrm_Int: case X86::VCMPSDrr_Int:
case X86::VCMPSDZrm_Int: case X86::VCMPSDZrr_Int:
case X86::VCMPSSrm: case X86::VCMPSSrr:
case X86::VCMPSSZrm: case X86::VCMPSSZrr:
case X86::VCMPSSrm_Int: case X86::VCMPSSrr_Int:
case X86::VCMPSSZrm_Int: case X86::VCMPSSZrr_Int:
case X86::VCMPPDZ128rmik: case X86::VCMPPDZ128rrik:
case X86::VCMPPDZ256rmik: case X86::VCMPPDZ256rrik:
case X86::VCMPPDZrmik: case X86::VCMPPDZrrik:
case X86::VCMPPSZ128rmik: case X86::VCMPPSZ128rrik:
case X86::VCMPPSZ256rmik: case X86::VCMPPSZ256rrik:
case X86::VCMPPSZrmik: case X86::VCMPPSZrrik:
case X86::VCMPSDZrm_Intk: case X86::VCMPSDZrr_Intk:
case X86::VCMPSSZrm_Intk: case X86::VCMPSSZrr_Intk:
case X86::VCMPPDZ128rmbi: case X86::VCMPPDZ128rmbik:
case X86::VCMPPDZ256rmbi: case X86::VCMPPDZ256rmbik:
case X86::VCMPPDZrmbi: case X86::VCMPPDZrmbik:
case X86::VCMPPSZ128rmbi: case X86::VCMPPSZ128rmbik:
case X86::VCMPPSZ256rmbi: case X86::VCMPPSZ256rmbik:
case X86::VCMPPSZrmbi: case X86::VCMPPSZrmbik:
case X86::VCMPPDZrrib: case X86::VCMPPDZrribk:
case X86::VCMPPSZrrib: case X86::VCMPPSZrribk:
case X86::VCMPSDZrrb_Int: case X86::VCMPSDZrrb_Intk:
case X86::VCMPSSZrrb_Int: case X86::VCMPSSZrrb_Intk:
if (Imm >= 0 && Imm <= 31) {
OS << '\t';
printCMPMnemonic(MI, /*IsVCMP*/true, OS);
unsigned CurOp = 0;
printOperand(MI, CurOp++, OS);
if (Desc.TSFlags & X86II::EVEX_K) {
// Print mask operand.
OS << " {";
printOperand(MI, CurOp++, OS);
OS << "}";
}
OS << ", ";
printOperand(MI, CurOp++, OS);
OS << ", ";
if ((Desc.TSFlags & X86II::FormMask) == X86II::MRMSrcMem) {
if (Desc.TSFlags & X86II::EVEX_B) {
// Broadcast form.
// Load size is based on W-bit.
if (Desc.TSFlags & X86II::VEX_W)
printqwordmem(MI, CurOp++, OS);
else
printdwordmem(MI, CurOp++, OS);
// Print the number of elements broadcasted.
unsigned NumElts;
if (Desc.TSFlags & X86II::EVEX_L2)
NumElts = (Desc.TSFlags & X86II::VEX_W) ? 8 : 16;
else if (Desc.TSFlags & X86II::VEX_L)
NumElts = (Desc.TSFlags & X86II::VEX_W) ? 4 : 8;
else
NumElts = (Desc.TSFlags & X86II::VEX_W) ? 2 : 4;
OS << "{1to" << NumElts << "}";
} else {
if ((Desc.TSFlags & X86II::OpPrefixMask) == X86II::XS)
printdwordmem(MI, CurOp++, OS);
else if ((Desc.TSFlags & X86II::OpPrefixMask) == X86II::XD)
printqwordmem(MI, CurOp++, OS);
else if (Desc.TSFlags & X86II::EVEX_L2)
printzmmwordmem(MI, CurOp++, OS);
else if (Desc.TSFlags & X86II::VEX_L)
printymmwordmem(MI, CurOp++, OS);
else
printxmmwordmem(MI, CurOp++, OS);
}
} else {
printOperand(MI, CurOp++, OS);
if (Desc.TSFlags & X86II::EVEX_B)
OS << ", {sae}";
}
return true;
}
break;
[X86] Remove the _alt forms of XOP VPCOM instructions. Use a combination of custom printing and custom parsing to achieve the same result and more Previously we had a regular form of the instruction used when the immediate was 0-7. And _alt form that allowed the full 8 bit immediate. Codegen would always use the 0-7 form since the immediate was always checked to be in range. Assembly parsing would use the 0-7 form when a mnemonic like vpcomtrueb was used. If the immediate was specified directly the _alt form was used. The disassembler would prefer to use the 0-7 form instruction when the immediate was in range and the _alt form otherwise. This way disassembly would print the most readable form when possible. The assembly parsing for things like vpcomtrueb relied on splitting the mnemonic into 3 pieces. A "vpcom" prefix, an immediate representing the "true", and a suffix of "b". The tablegenerated printing code would similarly print a "vpcom" prefix, decode the immediate into a string, and then print "b". The _alt form on the other hand parsed and printed like any other instruction with no specialness. With this patch we drop to one form and solve the disassembly printing issue by doing custom printing when the immediate is 0-7. The parsing code has been tweaked to turn "vpcomtrueb" into "vpcomb" and then the immediate for the "true" is inserted either before or after the other operands depending on at&t or intel syntax. I'd rather not do the custom printing, but I tried using an InstAlias for each possible mnemonic for all 8 immediates for all 16 combinations of element size, signedness, and memory/register. The code emitted into printAliasInstr ended up checking the number of operands, the register class of each operand, and the immediate for all 256 aliases. This was repeated for both the at&t and intel printer. Despite a lot of common checks between all of the aliases, when compiled with clang at least this commonality was not well optimized. Nor do all the checks seem necessary. Since I want to do a similar thing for vcmpps/pd/ss/sd which have 32 immediate values and 3 encoding flavors, 3 register sizes, etc. This didn't seem to scale well for clang binary size. So custom printing seemed a better trade off. I also considered just using the InstAlias for the matching and not the printing. But that seemed like it would add a lot of extra rows to the matcher table. Especially given that the 32 immediates for vpcmpps have 46 strings associated with them. Differential Revision: https://reviews.llvm.org/D59398 llvm-svn: 356343
2019-03-18 05:21:37 +08:00
case X86::VPCOMBmi: case X86::VPCOMBri:
case X86::VPCOMDmi: case X86::VPCOMDri:
case X86::VPCOMQmi: case X86::VPCOMQri:
case X86::VPCOMUBmi: case X86::VPCOMUBri:
case X86::VPCOMUDmi: case X86::VPCOMUDri:
case X86::VPCOMUQmi: case X86::VPCOMUQri:
case X86::VPCOMUWmi: case X86::VPCOMUWri:
case X86::VPCOMWmi: case X86::VPCOMWri:
if (Imm >= 0 && Imm <= 7) {
OS << '\t';
[X86] Remove the _alt forms of XOP VPCOM instructions. Use a combination of custom printing and custom parsing to achieve the same result and more Previously we had a regular form of the instruction used when the immediate was 0-7. And _alt form that allowed the full 8 bit immediate. Codegen would always use the 0-7 form since the immediate was always checked to be in range. Assembly parsing would use the 0-7 form when a mnemonic like vpcomtrueb was used. If the immediate was specified directly the _alt form was used. The disassembler would prefer to use the 0-7 form instruction when the immediate was in range and the _alt form otherwise. This way disassembly would print the most readable form when possible. The assembly parsing for things like vpcomtrueb relied on splitting the mnemonic into 3 pieces. A "vpcom" prefix, an immediate representing the "true", and a suffix of "b". The tablegenerated printing code would similarly print a "vpcom" prefix, decode the immediate into a string, and then print "b". The _alt form on the other hand parsed and printed like any other instruction with no specialness. With this patch we drop to one form and solve the disassembly printing issue by doing custom printing when the immediate is 0-7. The parsing code has been tweaked to turn "vpcomtrueb" into "vpcomb" and then the immediate for the "true" is inserted either before or after the other operands depending on at&t or intel syntax. I'd rather not do the custom printing, but I tried using an InstAlias for each possible mnemonic for all 8 immediates for all 16 combinations of element size, signedness, and memory/register. The code emitted into printAliasInstr ended up checking the number of operands, the register class of each operand, and the immediate for all 256 aliases. This was repeated for both the at&t and intel printer. Despite a lot of common checks between all of the aliases, when compiled with clang at least this commonality was not well optimized. Nor do all the checks seem necessary. Since I want to do a similar thing for vcmpps/pd/ss/sd which have 32 immediate values and 3 encoding flavors, 3 register sizes, etc. This didn't seem to scale well for clang binary size. So custom printing seemed a better trade off. I also considered just using the InstAlias for the matching and not the printing. But that seemed like it would add a lot of extra rows to the matcher table. Especially given that the 32 immediates for vpcmpps have 46 strings associated with them. Differential Revision: https://reviews.llvm.org/D59398 llvm-svn: 356343
2019-03-18 05:21:37 +08:00
printVPCOMMnemonic(MI, OS);
printOperand(MI, 0, OS);
OS << ", ";
printOperand(MI, 1, OS);
OS << ", ";
if ((Desc.TSFlags & X86II::FormMask) == X86II::MRMSrcMem)
printxmmwordmem(MI, 2, OS);
[X86] Remove the _alt forms of XOP VPCOM instructions. Use a combination of custom printing and custom parsing to achieve the same result and more Previously we had a regular form of the instruction used when the immediate was 0-7. And _alt form that allowed the full 8 bit immediate. Codegen would always use the 0-7 form since the immediate was always checked to be in range. Assembly parsing would use the 0-7 form when a mnemonic like vpcomtrueb was used. If the immediate was specified directly the _alt form was used. The disassembler would prefer to use the 0-7 form instruction when the immediate was in range and the _alt form otherwise. This way disassembly would print the most readable form when possible. The assembly parsing for things like vpcomtrueb relied on splitting the mnemonic into 3 pieces. A "vpcom" prefix, an immediate representing the "true", and a suffix of "b". The tablegenerated printing code would similarly print a "vpcom" prefix, decode the immediate into a string, and then print "b". The _alt form on the other hand parsed and printed like any other instruction with no specialness. With this patch we drop to one form and solve the disassembly printing issue by doing custom printing when the immediate is 0-7. The parsing code has been tweaked to turn "vpcomtrueb" into "vpcomb" and then the immediate for the "true" is inserted either before or after the other operands depending on at&t or intel syntax. I'd rather not do the custom printing, but I tried using an InstAlias for each possible mnemonic for all 8 immediates for all 16 combinations of element size, signedness, and memory/register. The code emitted into printAliasInstr ended up checking the number of operands, the register class of each operand, and the immediate for all 256 aliases. This was repeated for both the at&t and intel printer. Despite a lot of common checks between all of the aliases, when compiled with clang at least this commonality was not well optimized. Nor do all the checks seem necessary. Since I want to do a similar thing for vcmpps/pd/ss/sd which have 32 immediate values and 3 encoding flavors, 3 register sizes, etc. This didn't seem to scale well for clang binary size. So custom printing seemed a better trade off. I also considered just using the InstAlias for the matching and not the printing. But that seemed like it would add a lot of extra rows to the matcher table. Especially given that the 32 immediates for vpcmpps have 46 strings associated with them. Differential Revision: https://reviews.llvm.org/D59398 llvm-svn: 356343
2019-03-18 05:21:37 +08:00
else
printOperand(MI, 2, OS);
return true;
}
break;
case X86::VPCMPBZ128rmi: case X86::VPCMPBZ128rri:
case X86::VPCMPBZ256rmi: case X86::VPCMPBZ256rri:
case X86::VPCMPBZrmi: case X86::VPCMPBZrri:
case X86::VPCMPDZ128rmi: case X86::VPCMPDZ128rri:
case X86::VPCMPDZ256rmi: case X86::VPCMPDZ256rri:
case X86::VPCMPDZrmi: case X86::VPCMPDZrri:
case X86::VPCMPQZ128rmi: case X86::VPCMPQZ128rri:
case X86::VPCMPQZ256rmi: case X86::VPCMPQZ256rri:
case X86::VPCMPQZrmi: case X86::VPCMPQZrri:
case X86::VPCMPUBZ128rmi: case X86::VPCMPUBZ128rri:
case X86::VPCMPUBZ256rmi: case X86::VPCMPUBZ256rri:
case X86::VPCMPUBZrmi: case X86::VPCMPUBZrri:
case X86::VPCMPUDZ128rmi: case X86::VPCMPUDZ128rri:
case X86::VPCMPUDZ256rmi: case X86::VPCMPUDZ256rri:
case X86::VPCMPUDZrmi: case X86::VPCMPUDZrri:
case X86::VPCMPUQZ128rmi: case X86::VPCMPUQZ128rri:
case X86::VPCMPUQZ256rmi: case X86::VPCMPUQZ256rri:
case X86::VPCMPUQZrmi: case X86::VPCMPUQZrri:
case X86::VPCMPUWZ128rmi: case X86::VPCMPUWZ128rri:
case X86::VPCMPUWZ256rmi: case X86::VPCMPUWZ256rri:
case X86::VPCMPUWZrmi: case X86::VPCMPUWZrri:
case X86::VPCMPWZ128rmi: case X86::VPCMPWZ128rri:
case X86::VPCMPWZ256rmi: case X86::VPCMPWZ256rri:
case X86::VPCMPWZrmi: case X86::VPCMPWZrri:
case X86::VPCMPBZ128rmik: case X86::VPCMPBZ128rrik:
case X86::VPCMPBZ256rmik: case X86::VPCMPBZ256rrik:
case X86::VPCMPBZrmik: case X86::VPCMPBZrrik:
case X86::VPCMPDZ128rmik: case X86::VPCMPDZ128rrik:
case X86::VPCMPDZ256rmik: case X86::VPCMPDZ256rrik:
case X86::VPCMPDZrmik: case X86::VPCMPDZrrik:
case X86::VPCMPQZ128rmik: case X86::VPCMPQZ128rrik:
case X86::VPCMPQZ256rmik: case X86::VPCMPQZ256rrik:
case X86::VPCMPQZrmik: case X86::VPCMPQZrrik:
case X86::VPCMPUBZ128rmik: case X86::VPCMPUBZ128rrik:
case X86::VPCMPUBZ256rmik: case X86::VPCMPUBZ256rrik:
case X86::VPCMPUBZrmik: case X86::VPCMPUBZrrik:
case X86::VPCMPUDZ128rmik: case X86::VPCMPUDZ128rrik:
case X86::VPCMPUDZ256rmik: case X86::VPCMPUDZ256rrik:
case X86::VPCMPUDZrmik: case X86::VPCMPUDZrrik:
case X86::VPCMPUQZ128rmik: case X86::VPCMPUQZ128rrik:
case X86::VPCMPUQZ256rmik: case X86::VPCMPUQZ256rrik:
case X86::VPCMPUQZrmik: case X86::VPCMPUQZrrik:
case X86::VPCMPUWZ128rmik: case X86::VPCMPUWZ128rrik:
case X86::VPCMPUWZ256rmik: case X86::VPCMPUWZ256rrik:
case X86::VPCMPUWZrmik: case X86::VPCMPUWZrrik:
case X86::VPCMPWZ128rmik: case X86::VPCMPWZ128rrik:
case X86::VPCMPWZ256rmik: case X86::VPCMPWZ256rrik:
case X86::VPCMPWZrmik: case X86::VPCMPWZrrik:
case X86::VPCMPDZ128rmib: case X86::VPCMPDZ128rmibk:
case X86::VPCMPDZ256rmib: case X86::VPCMPDZ256rmibk:
case X86::VPCMPDZrmib: case X86::VPCMPDZrmibk:
case X86::VPCMPQZ128rmib: case X86::VPCMPQZ128rmibk:
case X86::VPCMPQZ256rmib: case X86::VPCMPQZ256rmibk:
case X86::VPCMPQZrmib: case X86::VPCMPQZrmibk:
case X86::VPCMPUDZ128rmib: case X86::VPCMPUDZ128rmibk:
case X86::VPCMPUDZ256rmib: case X86::VPCMPUDZ256rmibk:
case X86::VPCMPUDZrmib: case X86::VPCMPUDZrmibk:
case X86::VPCMPUQZ128rmib: case X86::VPCMPUQZ128rmibk:
case X86::VPCMPUQZ256rmib: case X86::VPCMPUQZ256rmibk:
case X86::VPCMPUQZrmib: case X86::VPCMPUQZrmibk:
if ((Imm >= 0 && Imm <= 2) || (Imm >= 4 && Imm <= 6)) {
OS << '\t';
printVPCMPMnemonic(MI, OS);
unsigned CurOp = 0;
printOperand(MI, CurOp++, OS);
if (Desc.TSFlags & X86II::EVEX_K) {
// Print mask operand.
OS << " {";
printOperand(MI, CurOp++, OS);
OS << "}";
}
OS << ", ";
printOperand(MI, CurOp++, OS);
OS << ", ";
if ((Desc.TSFlags & X86II::FormMask) == X86II::MRMSrcMem) {
if (Desc.TSFlags & X86II::EVEX_B) {
// Broadcast form.
// Load size is based on W-bit as only D and Q are supported.
if (Desc.TSFlags & X86II::VEX_W)
printqwordmem(MI, CurOp++, OS);
else
printdwordmem(MI, CurOp++, OS);
// Print the number of elements broadcasted.
unsigned NumElts;
if (Desc.TSFlags & X86II::EVEX_L2)
NumElts = (Desc.TSFlags & X86II::VEX_W) ? 8 : 16;
else if (Desc.TSFlags & X86II::VEX_L)
NumElts = (Desc.TSFlags & X86II::VEX_W) ? 4 : 8;
else
NumElts = (Desc.TSFlags & X86II::VEX_W) ? 2 : 4;
OS << "{1to" << NumElts << "}";
} else {
if (Desc.TSFlags & X86II::EVEX_L2)
printzmmwordmem(MI, CurOp++, OS);
else if (Desc.TSFlags & X86II::VEX_L)
printymmwordmem(MI, CurOp++, OS);
else
printxmmwordmem(MI, CurOp++, OS);
}
} else {
printOperand(MI, CurOp++, OS);
}
return true;
}
break;
[X86] Remove the _alt forms of XOP VPCOM instructions. Use a combination of custom printing and custom parsing to achieve the same result and more Previously we had a regular form of the instruction used when the immediate was 0-7. And _alt form that allowed the full 8 bit immediate. Codegen would always use the 0-7 form since the immediate was always checked to be in range. Assembly parsing would use the 0-7 form when a mnemonic like vpcomtrueb was used. If the immediate was specified directly the _alt form was used. The disassembler would prefer to use the 0-7 form instruction when the immediate was in range and the _alt form otherwise. This way disassembly would print the most readable form when possible. The assembly parsing for things like vpcomtrueb relied on splitting the mnemonic into 3 pieces. A "vpcom" prefix, an immediate representing the "true", and a suffix of "b". The tablegenerated printing code would similarly print a "vpcom" prefix, decode the immediate into a string, and then print "b". The _alt form on the other hand parsed and printed like any other instruction with no specialness. With this patch we drop to one form and solve the disassembly printing issue by doing custom printing when the immediate is 0-7. The parsing code has been tweaked to turn "vpcomtrueb" into "vpcomb" and then the immediate for the "true" is inserted either before or after the other operands depending on at&t or intel syntax. I'd rather not do the custom printing, but I tried using an InstAlias for each possible mnemonic for all 8 immediates for all 16 combinations of element size, signedness, and memory/register. The code emitted into printAliasInstr ended up checking the number of operands, the register class of each operand, and the immediate for all 256 aliases. This was repeated for both the at&t and intel printer. Despite a lot of common checks between all of the aliases, when compiled with clang at least this commonality was not well optimized. Nor do all the checks seem necessary. Since I want to do a similar thing for vcmpps/pd/ss/sd which have 32 immediate values and 3 encoding flavors, 3 register sizes, etc. This didn't seem to scale well for clang binary size. So custom printing seemed a better trade off. I also considered just using the InstAlias for the matching and not the printing. But that seemed like it would add a lot of extra rows to the matcher table. Especially given that the 32 immediates for vpcmpps have 46 strings associated with them. Differential Revision: https://reviews.llvm.org/D59398 llvm-svn: 356343
2019-03-18 05:21:37 +08:00
}
return false;
}
void X86IntelInstPrinter::printOperand(const MCInst *MI, unsigned OpNo,
raw_ostream &O) {
const MCOperand &Op = MI->getOperand(OpNo);
if (Op.isReg()) {
printRegName(O, Op.getReg());
} else if (Op.isImm()) {
O << formatImm((int64_t)Op.getImm());
} else {
assert(Op.isExpr() && "unknown operand kind in printOperand");
O << "offset ";
Op.getExpr()->print(O, &MAI);
}
}
void X86IntelInstPrinter::printMemReference(const MCInst *MI, unsigned Op,
raw_ostream &O) {
[llvm-objdump] Symbolize binary addresses for low-noisy asm diff. When diffing disassembly dump of two binaries, I see lots of noises from mismatched jump target addresses and global data references, which unnecessarily causes diffs on every function, making it impractical. I'm trying to symbolize the raw binary addresses to minimize the diff noise. In this change, a local branch target is modeled as a label and the branch target operand will simply be printed as a label. Local labels are collected by a separate pre-decoding pass beforehand. A global data memory operand will be printed as a global symbol instead of the raw data address. Unfortunately, due to the way the disassembler is set up and to be less intrusive, a global symbol is always printed as the last operand of a memory access instruction. This is less than ideal but is probably acceptable from checking code quality point of view since on most targets an instruction can have at most one memory operand. So far only the X86 disassemblers are supported. Test Plan: llvm-objdump -d --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr : ``` Disassembly of section .text: <_start>: push rax mov dword ptr [rsp + 4], 0 mov dword ptr [rsp], 0 mov eax, dword ptr [rsp] cmp eax, dword ptr [rip + 4112] # 202182 <g> jge 0x20117e <_start+0x25> call 0x201158 <foo> inc dword ptr [rsp] jmp 0x201169 <_start+0x10> xor eax, eax pop rcx ret ``` llvm-objdump -d **--symbolize-operands** --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr : ``` Disassembly of section .text: <_start>: push rax mov dword ptr [rsp + 4], 0 mov dword ptr [rsp], 0 <L1>: mov eax, dword ptr [rsp] cmp eax, dword ptr <g> jge <L0> call <foo> inc dword ptr [rsp] jmp <L1> <L0>: xor eax, eax pop rcx ret ``` Note that the jump instructions like `jge 0x20117e <_start+0x25>` without this work is printed as a real target address and an offset from the leading symbol. With a change in the optimizer that adds/deletes an instruction, the address and offset may shift for targets placed after the instruction. This will be a problem when diffing the disassembly from two optimizers where there are unnecessary false positives due to such branch target address changes. With `--symbolize-operand`, a label is printed for a branch target instead to reduce the false positives. Similarly, the disassemble of PC-relative global variable references is also prone to instruction insertion/deletion. Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D84191
2020-07-21 00:45:32 +08:00
// Do not print the exact form of the memory operand if it references a known
// binary object.
if (SymbolizeOperands && MIA) {
uint64_t Target;
if (MIA->evaluateBranch(*MI, 0, 0, Target))
return;
if (MIA->evaluateMemoryOperandAddress(*MI, 0, 0))
return;
}
const MCOperand &BaseReg = MI->getOperand(Op+X86::AddrBaseReg);
unsigned ScaleVal = MI->getOperand(Op+X86::AddrScaleAmt).getImm();
const MCOperand &IndexReg = MI->getOperand(Op+X86::AddrIndexReg);
const MCOperand &DispSpec = MI->getOperand(Op+X86::AddrDisp);
// If this has a segment register, print it.
printOptionalSegReg(MI, Op + X86::AddrSegmentReg, O);
O << '[';
bool NeedPlus = false;
if (BaseReg.getReg()) {
printOperand(MI, Op+X86::AddrBaseReg, O);
NeedPlus = true;
}
if (IndexReg.getReg()) {
if (NeedPlus) O << " + ";
if (ScaleVal != 1)
O << ScaleVal << '*';
printOperand(MI, Op+X86::AddrIndexReg, O);
NeedPlus = true;
}
2012-10-04 03:00:20 +08:00
if (!DispSpec.isImm()) {
if (NeedPlus) O << " + ";
assert(DispSpec.isExpr() && "non-immediate displacement for LEA?");
DispSpec.getExpr()->print(O, &MAI);
} else {
int64_t DispVal = DispSpec.getImm();
if (DispVal || (!IndexReg.getReg() && !BaseReg.getReg())) {
if (NeedPlus) {
if (DispVal > 0)
O << " + ";
else {
O << " - ";
DispVal = -DispVal;
}
}
O << formatImm(DispVal);
}
}
O << ']';
}
void X86IntelInstPrinter::printSrcIdx(const MCInst *MI, unsigned Op,
raw_ostream &O) {
// If this has a segment register, print it.
printOptionalSegReg(MI, Op + 1, O);
O << '[';
printOperand(MI, Op, O);
O << ']';
}
void X86IntelInstPrinter::printDstIdx(const MCInst *MI, unsigned Op,
raw_ostream &O) {
// DI accesses are always ES-based.
O << "es:[";
printOperand(MI, Op, O);
O << ']';
}
void X86IntelInstPrinter::printMemOffset(const MCInst *MI, unsigned Op,
raw_ostream &O) {
const MCOperand &DispSpec = MI->getOperand(Op);
// If this has a segment register, print it.
printOptionalSegReg(MI, Op + 1, O);
O << '[';
if (DispSpec.isImm()) {
O << formatImm(DispSpec.getImm());
} else {
assert(DispSpec.isExpr() && "non-immediate displacement?");
DispSpec.getExpr()->print(O, &MAI);
}
O << ']';
}
void X86IntelInstPrinter::printU8Imm(const MCInst *MI, unsigned Op,
raw_ostream &O) {
if (MI->getOperand(Op).isExpr())
return MI->getOperand(Op).getExpr()->print(O, &MAI);
O << formatImm(MI->getOperand(Op).getImm() & 0xff);
}
void X86IntelInstPrinter::printSTiRegOperand(const MCInst *MI, unsigned OpNo,
raw_ostream &OS) {
const MCOperand &Op = MI->getOperand(OpNo);
unsigned Reg = Op.getReg();
// Override the default printing to print st(0) instead st.
if (Reg == X86::ST0)
OS << "st(0)";
else
printRegName(OS, Reg);
}