2011-01-20 14:39:06 +08:00
|
|
|
//===-- llvm-objdump.cpp - Object file dumping utility for llvm -----------===//
|
|
|
|
//
|
2019-01-19 16:50:56 +08:00
|
|
|
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
|
|
|
|
// See https://llvm.org/LICENSE.txt for license information.
|
|
|
|
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
|
2011-01-20 14:39:06 +08:00
|
|
|
//
|
|
|
|
//===----------------------------------------------------------------------===//
|
|
|
|
//
|
|
|
|
// This program is a utility that works like binutils "objdump", that is, it
|
|
|
|
// dumps out a plethora of information about an object file depending on the
|
|
|
|
// flags.
|
|
|
|
//
|
2013-02-06 04:27:22 +08:00
|
|
|
// The flags and output of this program should be near identical to those of
|
|
|
|
// binutils objdump.
|
|
|
|
//
|
2011-01-20 14:39:06 +08:00
|
|
|
//===----------------------------------------------------------------------===//
|
|
|
|
|
2011-09-20 01:56:04 +08:00
|
|
|
#include "llvm-objdump.h"
|
[llvm-objdump][COFF][NFC] Split format-specific interfaces; add namespace
Summary:
This patch addresses, for the interfaces implemented by `COFFDump.cpp`,
multiple issues identified with the current structure of
`llvm-objdump.h` in the review of D72973.
This patch moves implementation details of the tool into an
`llvm::objdump` namespace for external linkage names, splits the
implementation details into separate headers for each implementation
file, and uses qualified names when declaring members of the
`llvm::objdump` namespace in place of leaving the namespace definition
open.
Reviewers: jhenderson, DiggerLin, jasonliu, daltenty, MaskRay
Reviewed By: jhenderson, MaskRay
Subscribers: MaskRay, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77285
2020-04-03 06:17:52 +08:00
|
|
|
#include "COFFDump.h"
|
2020-04-24 09:20:45 +08:00
|
|
|
#include "ELFDump.h"
|
2020-04-07 04:56:13 +08:00
|
|
|
#include "MachODump.h"
|
2020-04-15 06:24:37 +08:00
|
|
|
#include "WasmDump.h"
|
2020-04-06 22:09:12 +08:00
|
|
|
#include "XCOFFDump.h"
|
2020-03-17 22:21:42 +08:00
|
|
|
#include "llvm/ADT/IndexedMap.h"
|
2015-06-23 02:03:02 +08:00
|
|
|
#include "llvm/ADT/Optional.h"
|
2020-03-17 22:21:42 +08:00
|
|
|
#include "llvm/ADT/SmallSet.h"
|
2012-12-04 18:44:52 +08:00
|
|
|
#include "llvm/ADT/STLExtras.h"
|
2019-06-08 04:34:31 +08:00
|
|
|
#include "llvm/ADT/SetOperations.h"
|
2011-10-18 01:13:22 +08:00
|
|
|
#include "llvm/ADT/StringExtras.h"
|
2018-03-10 03:13:44 +08:00
|
|
|
#include "llvm/ADT/StringSet.h"
|
2011-01-20 14:39:06 +08:00
|
|
|
#include "llvm/ADT/Triple.h"
|
2021-01-07 20:34:11 +08:00
|
|
|
#include "llvm/ADT/Twine.h"
|
2016-01-26 23:09:42 +08:00
|
|
|
#include "llvm/DebugInfo/DWARF/DWARFContext.h"
|
2016-08-16 03:49:24 +08:00
|
|
|
#include "llvm/DebugInfo/Symbolize/Symbolize.h"
|
2018-07-19 00:39:21 +08:00
|
|
|
#include "llvm/Demangle/Demangle.h"
|
2011-01-20 14:39:06 +08:00
|
|
|
#include "llvm/MC/MCAsmInfo.h"
|
Add MCSymbolizer for symbolic/annotated disassembly.
This is a basic first step towards symbolization of disassembled
instructions. This used to be done using externally provided (C API)
callbacks. This patch introduces:
- the MCSymbolizer class, that mimics the same functions that were used
in the X86 and ARM disassemblers to symbolize immediate operands and
to annotate loads based off PC (for things like c string literals).
- the MCExternalSymbolizer class, which implements the old C API.
- the MCRelocationInfo class, which provides a way for targets to
translate relocations (either object::RelocationRef, or disassembler
C API VariantKinds) to MCExprs.
- the MCObjectSymbolizer class, which does symbolization using what it
finds in an object::ObjectFile. This makes simple symbolization (with
no fancy relocation stuff) work for all object formats!
- x86-64 Mach-O and ELF MCRelocationInfos.
- A basic ARM Mach-O MCRelocationInfo, that provides just enough to
support the C API VariantKinds.
Most of what works in otool (the only user of the old symbolization API
that I know of) for x86-64 symbolic disassembly (-tvV) works, namely:
- symbol references: call _foo; jmp 15 <_foo+50>
- relocations: call _foo-_bar; call _foo-4
- __cf?string: leaq 193(%rip), %rax ## literal pool for "hello"
Stub support is the main missing part (because libObject doesn't know,
among other things, about mach-o indirect symbols).
As for the MCSymbolizer API, instead of relying on the disassemblers
to call the tryAdding* methods, maybe this could be done automagically
using InstrInfo? For instance, even though PC-relative LEAs are used
to get the address of string literals in a typical Mach-O file, a MOV
would be used in an ELF file. And right now, the explicit symbolization
only recognizes PC-relative LEAs. InstrInfo should already have
most of what is needed to know what to symbolize, so this can
definitely be improved.
I'd also like to remove object::RelocationRef::getValueString (it seems
only used by relocation printing in objdump), as simply printing the
created MCExpr is definitely enough (and cleaner than string concats).
llvm-svn: 182625
2013-05-24 08:39:57 +08:00
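The symbol-reference form mentioned in the commit message above (`jmp 15 <_foo+50>`) can be sketched with the standard library alone. This is a minimal illustration, not the MCSymbolizer implementation: `symbolizeAddress` and the address-to-name map are hypothetical names, and the decimal offset is an assumption taken from the `<_foo+50>` example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper, not an LLVM API. Symbols maps each symbol's start
// address to its name.
std::string symbolizeAddress(const std::map<uint64_t, std::string> &Symbols,
                             uint64_t Addr) {
  // upper_bound finds the first symbol strictly above Addr, so the entry
  // before it (if any) is the nearest symbol at or below Addr.
  auto It = Symbols.upper_bound(Addr);
  if (It == Symbols.begin())
    return std::string(); // no symbol at or below this address
  --It;
  assert(It->first <= Addr && "chosen symbol must not start after Addr");
  std::ostringstream OS;
  OS << '<' << It->second;
  if (uint64_t Offset = Addr - It->first)
    OS << '+' << Offset; // decimal offset, as in "<_foo+50>"
  OS << '>';
  return OS.str();
}
```

An ordered map keeps the lookup logarithmic and makes "nearest symbol at or below the address" a single `upper_bound` call. A real symbolizer would also have to consider symbol sizes and section boundaries, which this sketch ignores.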
|
|
|
#include "llvm/MC/MCContext.h"
|
2016-01-27 00:44:37 +08:00
|
|
|
#include "llvm/MC/MCDisassembler/MCDisassembler.h"
|
|
|
|
#include "llvm/MC/MCDisassembler/MCRelocationInfo.h"
|
2011-01-20 14:39:06 +08:00
|
|
|
#include "llvm/MC/MCInst.h"
|
|
|
|
#include "llvm/MC/MCInstPrinter.h"
|
MC: Disassembled CFG reconstruction.
This patch builds on some existing code to do CFG reconstruction from
a disassembled binary:
- MCModule represents the binary, and has a list of MCAtoms.
- MCAtom represents either disassembled instructions (MCTextAtom), or
contiguous data (MCDataAtom), and covers a specific range of addresses.
- MCBasicBlock and MCFunction form the reconstructed CFG. An MCBB is
backed by an MCTextAtom, and has the usual successors/predecessors.
- MCObjectDisassembler creates a module from an ObjectFile using a
disassembler. It first builds an atom for each section. It can also
construct the CFG, and this splits the text atoms into basic blocks.
MCModule and MCAtom were only sketched out; MCFunction and MCBB were
implemented under the experimental "-cfg" llvm-objdump -macho option.
This cleans them up for further use; llvm-objdump -d -cfg now generates
graphviz files for each function found in the binary.
In the future, MCObjectDisassembler may be the right place to do
"intelligent" disassembly: for example, handling constant islands is just
a matter of splitting the atom, using information that may be available
in the ObjectFile. Also, better initial atom formation than just using
sections is possible using symbols (and things like Mach-O's
function_starts load command).
This brings two minor regressions in llvm-objdump -macho -cfg:
- The printing of a relocation's referenced symbol.
- An annotation on loop BBs, i.e., which are their own successor.
Relocation printing is replaced by the MCSymbolizer; the basic CFG
annotation will be superseded by more related functionality.
llvm-svn: 182628
2013-05-24 09:07:04 +08:00
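The step described above, splitting a text atom into basic blocks, is commonly done by finding "leader" addresses. The sketch below uses the textbook leader rule and is not the MCObjectDisassembler code: `Inst` and `findBlockStarts` are hypothetical names, and only explicit branches are treated as block terminators.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical, simplified instruction record (not an LLVM type).
struct Inst {
  uint64_t Address;
  uint64_t Size;
  bool IsBranch;   // true for jumps and conditional jumps
  uint64_t Target; // meaningful only when IsBranch is true
};

// Returns the start address of every basic block in a contiguous,
// address-ordered run of instructions.
std::set<uint64_t> findBlockStarts(const std::vector<Inst> &Insts) {
  std::set<uint64_t> Starts;
  if (Insts.empty())
    return Starts;
  const uint64_t Begin = Insts.front().Address;
  const uint64_t End = Insts.back().Address + Insts.back().Size;
  Starts.insert(Begin); // the first instruction always starts a block
  for (const Inst &I : Insts) {
    assert(I.Size > 0 && "instructions must occupy at least one byte");
    if (!I.IsBranch)
      continue;
    // A branch target inside this run starts a block...
    if (I.Target >= Begin && I.Target < End)
      Starts.insert(I.Target);
    // ...and so does the instruction that follows the branch.
    const uint64_t Next = I.Address + I.Size;
    if (Next < End)
      Starts.insert(Next);
  }
  return Starts;
}
```

Each returned address opens one block, and the block runs up to the next address in the set. Returns, calls and indirect branches would need extra handling in a real CFG builder.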
|
|
|
#include "llvm/MC/MCInstrAnalysis.h"
|
2012-04-02 14:09:36 +08:00
|
|
|
#include "llvm/MC/MCInstrInfo.h"
|
2013-05-24 08:39:57 +08:00
|
|
|
#include "llvm/MC/MCObjectFileInfo.h"
|
2012-03-06 03:33:20 +08:00
|
|
|
#include "llvm/MC/MCRegisterInfo.h"
|
2013-05-24 09:07:04 +08:00
|
|
|
#include "llvm/MC/MCSubtargetInfo.h"
|
2019-10-23 18:24:35 +08:00
|
|
|
#include "llvm/MC/MCTargetOptions.h"
|
2012-12-04 18:44:52 +08:00
|
|
|
#include "llvm/Object/Archive.h"
|
|
|
|
#include "llvm/Object/COFF.h"
|
2016-08-19 00:39:19 +08:00
|
|
|
#include "llvm/Object/COFFImportFile.h"
|
2016-01-27 00:44:37 +08:00
|
|
|
#include "llvm/Object/ELFObjectFile.h"
|
2021-01-28 02:21:47 +08:00
|
|
|
#include "llvm/Object/FaultMapParser.h"
|
Add a function to get the segment name of a section.
On MachO, sections also have segment names. When a tool looking at a .o file
prints a segment name, this is what they mean. In reality, a .o has only one
anonymous segment.
This patch adds a MachO only function to fetch that segment name. I named it
getSectionFinalSegmentName since the main use for the name seems to be to inform
the linker which segment this section should go to.
The patch also changes MachOObjectFile::getSectionName to return just the
section name instead of computing SegmentName,SectionName.
The main difference from the previous patch is that it doesn't use
InMemoryStruct. It is extremely dangerous: if the endians match it returns
a pointer to the file buffer, if not, it returns a pointer to an internal buffer
that is overwritten in the next API call.
We should change all of this code to use
support::detail::packed_endian_specific_integral like ELF, but since these
functions only handle strings, they work with big and little endian machines
as is.
I have tested this by installing Ubuntu 12.10 ppc on QEMU, which is why it took
so long :-)
llvm-svn: 170838
2012-12-21 11:47:03 +08:00
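The combined `SegmentName,SectionName` form that the commit message above refers to can be illustrated with plain strings. This is a sketch under stated assumptions: `joinSegmentAndSection` and `splitSegmentAndSection` are hypothetical helpers, not MachOObjectFile methods, and a name without a comma is assumed to be a bare section name.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Builds the combined "Segment,Section" form, e.g. "__TEXT,__text".
std::string joinSegmentAndSection(const std::string &Segment,
                                  const std::string &Section) {
  // A comma inside the segment name would make the result ambiguous.
  assert(Segment.find(',') == std::string::npos &&
         "segment name must not contain a comma");
  return Segment + "," + Section;
}

// Splits at the first comma. Without a comma, the whole string is treated
// as a section name and the segment is left empty.
std::pair<std::string, std::string>
splitSegmentAndSection(const std::string &Name) {
  const std::string::size_type Comma = Name.find(',');
  if (Comma == std::string::npos)
    return std::make_pair(std::string(), Name);
  return std::make_pair(Name.substr(0, Comma), Name.substr(Comma + 1));
}
```

Keeping the two parts separate, as the patch does, lets each caller decide whether it wants only the section name or the combined form.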
|
|
|
#include "llvm/Object/MachO.h"
|
2018-08-03 08:06:38 +08:00
|
|
|
#include "llvm/Object/MachOUniversal.h"
|
2012-12-04 18:44:52 +08:00
|
|
|
#include "llvm/Object/ObjectFile.h"
|
2017-06-28 04:40:53 +08:00
|
|
|
#include "llvm/Object/Wasm.h"
|
2011-10-08 08:18:30 +08:00
|
|
|
#include "llvm/Support/Casting.h"
|
2011-01-20 14:39:06 +08:00
|
|
|
#include "llvm/Support/CommandLine.h"
|
|
|
|
#include "llvm/Support/Debug.h"
|
2015-06-05 02:34:11 +08:00
|
|
|
#include "llvm/Support/Errc.h"
|
2011-10-08 08:18:30 +08:00
|
|
|
#include "llvm/Support/FileSystem.h"
|
2011-01-20 14:39:06 +08:00
|
|
|
#include "llvm/Support/Format.h"
|
2019-08-15 13:15:22 +08:00
|
|
|
#include "llvm/Support/FormatVariadic.h"
|
2011-07-26 07:04:36 +08:00
|
|
|
#include "llvm/Support/GraphWriter.h"
|
2011-01-20 14:39:06 +08:00
|
|
|
#include "llvm/Support/Host.h"
|
2018-04-14 02:26:06 +08:00
|
|
|
#include "llvm/Support/InitLLVM.h"
|
2011-01-20 14:39:06 +08:00
|
|
|
#include "llvm/Support/MemoryBuffer.h"
|
|
|
|
#include "llvm/Support/SourceMgr.h"
|
2018-08-24 23:21:57 +08:00
|
|
|
#include "llvm/Support/StringSaver.h"
|
2011-08-25 02:08:43 +08:00
|
|
|
#include "llvm/Support/TargetRegistry.h"
|
|
|
|
#include "llvm/Support/TargetSelect.h"
|
2018-11-12 06:12:04 +08:00
|
|
|
#include "llvm/Support/WithColor.h"
|
2011-01-20 14:39:06 +08:00
|
|
|
#include "llvm/Support/raw_ostream.h"
|
|
|
|
#include <algorithm>
|
2012-03-23 19:49:32 +08:00
|
|
|
#include <cctype>
|
2011-01-20 14:39:06 +08:00
|
|
|
#include <cstring>
|
2014-06-13 01:38:55 +08:00
|
|
|
#include <system_error>
|
2016-08-16 03:49:24 +08:00
|
|
|
#include <unordered_map>
|
2017-07-20 06:27:28 +08:00
|
|
|
#include <utility>
|
MC CFG: Add YAML MCModule representation to enable MC CFG testing.
Like yaml ObjectFiles, this will be very useful for testing the MC CFG
implementation (mostly MCObjectDisassembler), by matching the output
with YAML, and for potential users of the MC CFG, by using it as an input.
There isn't much to the actual format, it is just a serialization of the
MCModule class. Of note:
- Basic block references (pred/succ, ..) are represented by the BB's
start address.
- Just as in the MC CFG, instructions are MCInsts with a size.
- Operands have a prefix representing the type (only register and
immediate supported here).
- Instruction opcodes are represented by their names; enum values aren't
stable, enum names mostly are: usually, a change to a name would need
lots of changes in the backend anyway.
Same with registers.
All in all, an example is better than 1000 words, here goes:
A simple binary:
Disassembly of section __TEXT,__text:
_main:
100000f9c: 48 8b 46 08 movq 8(%rsi), %rax
100000fa0: 0f be 00 movsbl (%rax), %eax
100000fa3: 3b 04 25 48 00 00 00 cmpl 72, %eax
100000faa: 0f 8c 07 00 00 00 jl 7 <.Lend>
100000fb0: 2b 04 25 48 00 00 00 subl 72, %eax
.Lend:
100000fb7: c3 ret
And the (pretty verbose) generated YAML:
---
Atoms:
  - StartAddress: 0x0000000100000F9C
    Size: 20
    Type: Text
    Content:
      - Inst: MOV64rm
        Size: 4
        Ops: [ RRAX, RRSI, I1, R, I8, R ]
      - Inst: MOVSX32rm8
        Size: 3
        Ops: [ REAX, RRAX, I1, R, I0, R ]
      - Inst: CMP32rm
        Size: 7
        Ops: [ REAX, R, I1, R, I72, R ]
      - Inst: JL_4
        Size: 6
        Ops: [ I7 ]
  - StartAddress: 0x0000000100000FB0
    Size: 7
    Type: Text
    Content:
      - Inst: SUB32rm
        Size: 7
        Ops: [ REAX, REAX, R, I1, R, I72, R ]
  - StartAddress: 0x0000000100000FB7
    Size: 1
    Type: Text
    Content:
      - Inst: RET
        Size: 1
        Ops: [ ]
Functions:
  - Name: __text
    BasicBlocks:
      - Address: 0x0000000100000F9C
        Preds: [ ]
        Succs: [ 0x0000000100000FB7, 0x0000000100000FB0 ]
<snip>
...
llvm-svn: 188890
2013-08-21 15:29:02 +08:00
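The operand encoding described above (a one-letter type prefix, with only register and immediate supported) could be decoded roughly as follows. This is an illustrative sketch, not the YAML MCModule reader: `Operand` and `parseOperand` are hypothetical names, and reading a bare `R` as an empty register slot is an assumption based on the example output.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical decoded operand (not an LLVM type).
struct Operand {
  enum Kind { Register, Immediate, Invalid };
  Kind K;
  std::string RegName; // valid when K == Register; may be empty
  int64_t Imm;         // valid when K == Immediate
};

// Decodes one operand such as "RRAX", "R" or "I72".
Operand parseOperand(const std::string &Text) {
  assert(!Text.empty() && "an operand always carries a type prefix");
  Operand Op = {Operand::Invalid, std::string(), 0};
  if (Text[0] == 'R') {
    // Everything after the prefix is the register name.
    Op.K = Operand::Register;
    Op.RegName = Text.substr(1);
  } else if (Text[0] == 'I' && Text.size() > 1) {
    // Everything after the prefix must be a decimal integer.
    char *End = nullptr;
    const long long Value = std::strtoll(Text.c_str() + 1, &End, 10);
    if (*End == '\0') {
      Op.K = Operand::Immediate;
      Op.Imm = Value;
    }
  }
  return Op;
}
```

Storing register names rather than enum values matches the rationale given in the commit message: names are mostly stable across backend changes, while enum values are not.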
|
|
|
|
2020-04-07 04:56:13 +08:00
|
|
|
using namespace llvm;
|
2019-04-15 23:31:42 +08:00
|
|
|
using namespace llvm::object;
|
2020-04-03 06:17:52 +08:00
|
|
|
using namespace llvm::objdump;
|
2011-01-20 14:39:06 +08:00
|
|
|
|
2020-03-17 22:21:42 +08:00
|
|
|
#define DEBUG_TYPE "objdump"
|
|
|
|
|
2020-04-07 04:56:13 +08:00
|
|
|
static cl::OptionCategory ObjdumpCat("llvm-objdump Options");
|
2019-04-15 23:00:10 +08:00
|
|
|
|
2019-04-24 10:40:20 +08:00
|
|
|
static cl::opt<uint64_t> AdjustVMA(
|
2019-01-28 18:44:01 +08:00
|
|
|
"adjust-vma",
|
|
|
|
cl::desc("Increase the displayed address by the specified offset"),
|
2019-05-22 14:30:46 +08:00
|
|
|
cl::value_desc("offset"), cl::init(0), cl::cat(ObjdumpCat));
|
2019-01-28 18:44:01 +08:00
|
|
|
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::opt<bool>
|
|
|
|
AllHeaders("all-headers",
|
2019-05-22 14:30:46 +08:00
|
|
|
cl::desc("Display all available header information"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2018-06-28 04:45:11 +08:00
|
|
|
static cl::alias AllHeadersShort("x", cl::desc("Alias for --all-headers"),
|
2019-02-20 03:46:08 +08:00
|
|
|
cl::NotHidden, cl::Grouping,
|
|
|
|
cl::aliasopt(AllHeaders));
|
2018-06-28 04:45:11 +08:00
|
|
|
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::opt<std::string>
|
2019-05-22 14:30:46 +08:00
|
|
|
ArchName("arch-name",
|
|
|
|
cl::desc("Target arch to disassemble for, "
|
2020-11-30 19:32:46 +08:00
|
|
|
"see --version for available targets"),
|
2019-05-22 14:30:46 +08:00
|
|
|
cl::cat(ObjdumpCat));
|
2015-07-29 23:45:39 +08:00
|
|
|
|
2020-04-07 04:56:13 +08:00
|
|
|
cl::opt<bool>
|
|
|
|
objdump::ArchiveHeaders("archive-headers",
|
|
|
|
cl::desc("Display archive header information"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::alias ArchiveHeadersShort("a",
|
|
|
|
cl::desc("Alias for --archive-headers"),
|
|
|
|
cl::NotHidden, cl::Grouping,
|
|
|
|
cl::aliasopt(ArchiveHeaders));
|
2018-07-19 00:39:21 +08:00
|
|
|
|
2020-04-07 04:56:13 +08:00
|
|
|
cl::opt<bool> objdump::Demangle("demangle", cl::desc("Demangle symbol names"),
|
|
|
|
cl::init(false), cl::cat(ObjdumpCat));
|
2018-07-19 00:39:21 +08:00
|
|
|
static cl::alias DemangleShort("C", cl::desc("Alias for --demangle"),
|
2019-02-20 03:46:08 +08:00
|
|
|
cl::NotHidden, cl::Grouping,
|
2019-04-15 23:00:10 +08:00
|
|
|
cl::aliasopt(Demangle));
|
|
|
|
|
2020-04-07 04:56:13 +08:00
|
|
|
cl::opt<bool> objdump::Disassemble(
|
2019-04-15 23:00:10 +08:00
|
|
|
"disassemble",
|
2019-05-22 14:30:46 +08:00
|
|
|
cl::desc("Display assembler mnemonics for the machine instructions"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::alias DisassembleShort("d", cl::desc("Alias for --disassemble"),
|
|
|
|
cl::NotHidden, cl::Grouping,
|
|
|
|
cl::aliasopt(Disassemble));
|
|
|
|
|
2020-04-07 04:56:13 +08:00
|
|
|
cl::opt<bool> objdump::DisassembleAll(
|
2019-04-15 23:00:10 +08:00
|
|
|
"disassemble-all",
|
2019-05-22 14:30:46 +08:00
|
|
|
cl::desc("Disassemble all sections found in the input files"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::alias DisassembleAllShort("D",
|
|
|
|
cl::desc("Alias for --disassemble-all"),
|
|
|
|
cl::NotHidden, cl::Grouping,
|
|
|
|
cl::aliasopt(DisassembleAll));
|
2018-07-19 00:39:21 +08:00
|
|
|
|
2020-04-22 05:52:08 +08:00
|
|
|
cl::opt<bool> objdump::SymbolDescription(
|
|
|
|
"symbol-description",
|
|
|
|
cl::desc("Add symbol description for disassembly. This "
|
|
|
|
"option is for XCOFF files only"),
|
|
|
|
cl::init(false), cl::cat(ObjdumpCat));
|
2020-04-06 22:09:12 +08:00
|
|
|
|
2018-03-10 03:13:44 +08:00
|
|
|
static cl::list<std::string>
|
2020-03-08 04:55:44 +08:00
|
|
|
DisassembleSymbols("disassemble-symbols", cl::CommaSeparated,
|
|
|
|
cl::desc("List of symbols to disassemble. "
|
|
|
|
"Accept demangled names when --demangle is "
|
|
|
|
"specified, otherwise accept mangled names"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2018-03-10 03:13:44 +08:00
|
|
|
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::opt<bool> DisassembleZeroes(
|
|
|
|
"disassemble-zeroes",
|
2019-05-22 14:30:46 +08:00
|
|
|
cl::desc("Do not skip blocks of zeroes when disassembling"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::alias
|
|
|
|
DisassembleZeroesShort("z", cl::desc("Alias for --disassemble-zeroes"),
|
|
|
|
cl::NotHidden, cl::Grouping,
|
|
|
|
cl::aliasopt(DisassembleZeroes));
|
2011-10-08 08:18:30 +08:00
|
|
|
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::list<std::string>
|
|
|
|
DisassemblerOptions("disassembler-options",
|
|
|
|
cl::desc("Pass target specific disassembler options"),
|
2019-05-22 14:30:46 +08:00
|
|
|
cl::value_desc("options"), cl::CommaSeparated,
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::alias
|
|
|
|
DisassemblerOptionsShort("M", cl::desc("Alias for --disassembler-options"),
|
|
|
|
cl::NotHidden, cl::Grouping, cl::Prefix,
|
|
|
|
cl::CommaSeparated,
|
|
|
|
cl::aliasopt(DisassemblerOptions));
|
2018-06-07 21:30:55 +08:00
|
|
|
|
2020-04-07 04:56:13 +08:00
|
|
|
cl::opt<DIDumpType> objdump::DwarfDumpType(
|
2019-04-15 23:00:10 +08:00
|
|
|
"dwarf", cl::init(DIDT_Null), cl::desc("Dump of dwarf debug sections:"),
|
2019-05-22 14:30:46 +08:00
|
|
|
cl::values(clEnumValN(DIDT_DebugFrame, "frames", ".debug_frame")),
|
|
|
|
cl::cat(ObjdumpCat));
|
2011-10-18 01:13:22 +08:00
|
|
|
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::opt<bool> DynamicRelocations(
|
|
|
|
"dynamic-reloc",
|
2019-05-22 14:30:46 +08:00
|
|
|
cl::desc("Display the dynamic relocation entries in the file"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::alias DynamicRelocationShort("R",
|
|
|
|
cl::desc("Alias for --dynamic-reloc"),
|
|
|
|
cl::NotHidden, cl::Grouping,
|
|
|
|
cl::aliasopt(DynamicRelocations));
|
2011-10-19 03:32:17 +08:00
|
|
|
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::opt<bool>
|
|
|
|
FaultMapSection("fault-map-section",
|
2019-05-22 14:30:46 +08:00
|
|
|
cl::desc("Display contents of faultmap section"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2014-09-13 05:34:15 +08:00
|
|
|
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::opt<bool>
|
|
|
|
FileHeaders("file-headers",
|
2019-05-22 14:30:46 +08:00
|
|
|
cl::desc("Display the contents of the overall file header"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::alias FileHeadersShort("f", cl::desc("Alias for --file-headers"),
|
|
|
|
cl::NotHidden, cl::Grouping,
|
|
|
|
cl::aliasopt(FileHeaders));
|
2014-09-16 09:41:51 +08:00
|
|
|
|
2020-04-07 04:56:13 +08:00
|
|
|
cl::opt<bool>
|
|
|
|
objdump::SectionContents("full-contents",
|
|
|
|
cl::desc("Display the content of each section"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::alias SectionContentsShort("s",
|
|
|
|
cl::desc("Alias for --full-contents"),
|
|
|
|
cl::NotHidden, cl::Grouping,
|
|
|
|
cl::aliasopt(SectionContents));
|
2014-09-16 09:41:51 +08:00
|
|
|
|
2019-05-22 14:30:46 +08:00
|
|
|
static cl::list<std::string> InputFilenames(cl::Positional,
|
|
|
|
cl::desc("<input object files>"),
|
|
|
|
cl::ZeroOrMore,
|
|
|
|
cl::cat(ObjdumpCat));
|
2014-09-16 09:41:51 +08:00
|
|
|
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::opt<bool>
|
|
|
|
PrintLines("line-numbers",
|
|
|
|
cl::desc("Display source line numbers with "
|
2019-05-22 14:30:46 +08:00
|
|
|
"disassembly. Implies --disassemble"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::alias PrintLinesShort("l", cl::desc("Alias for --line-numbers"),
|
|
|
|
cl::NotHidden, cl::Grouping,
|
|
|
|
cl::aliasopt(PrintLines));
|
2015-07-08 10:04:15 +08:00
|
|
|
|
2019-05-22 14:30:46 +08:00
|
|
|
static cl::opt<bool> MachOOpt("macho",
|
|
|
|
cl::desc("Use MachO specific object file parser"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-01-18 18:41:26 +08:00
|
|
|
static cl::alias MachOm("m", cl::desc("Alias for --macho"), cl::NotHidden,
|
2019-02-20 03:46:08 +08:00
|
|
|
cl::Grouping, cl::aliasopt(MachOOpt));
|
2011-07-21 03:37:35 +08:00
|
|
|
|
2020-04-07 04:56:13 +08:00
|
|
|
cl::opt<std::string> objdump::MCPU(
|
2020-11-30 18:39:07 +08:00
|
|
|
"mcpu", cl::desc("Target a specific cpu type (--mcpu=help for details)"),
|
2020-04-07 04:56:13 +08:00
|
|
|
cl::value_desc("cpu-name"), cl::init(""), cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
|
2020-11-30 18:39:07 +08:00
|
|
|
cl::list<std::string> objdump::MAttrs(
|
|
|
|
"mattr", cl::CommaSeparated,
|
|
|
|
cl::desc("Target specific attributes (--mattr=help for details)"),
|
|
|
|
cl::value_desc("a1,+a2,-a3,..."), cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
|
2020-04-07 04:56:13 +08:00
|
|
|
cl::opt<bool> objdump::NoShowRawInsn(
|
|
|
|
"no-show-raw-insn",
|
|
|
|
cl::desc(
|
|
|
|
"When disassembling instructions, do not print the instruction bytes."),
|
|
|
|
cl::cat(ObjdumpCat));
|
|
|
|
|
|
|
|
cl::opt<bool> objdump::NoLeadingAddr("no-leading-addr",
|
|
|
|
cl::desc("Print no leading address"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
|
|
|
|
static cl::opt<bool> RawClangAST(
|
|
|
|
"raw-clang-ast",
|
2019-05-22 14:30:46 +08:00
|
|
|
cl::desc("Dump the raw binary contents of the clang AST section"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2011-01-20 14:39:06 +08:00
|
|
|
|
2019-04-15 23:00:10 +08:00
|
|
|
cl::opt<bool>
|
2020-04-07 04:56:13 +08:00
|
|
|
objdump::Relocations("reloc",
|
|
|
|
cl::desc("Display the relocation entries in the file"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::alias RelocationsShort("r", cl::desc("Alias for --reloc"),
|
|
|
|
cl::NotHidden, cl::Grouping,
|
|
|
|
cl::aliasopt(Relocations));
|
2014-08-07 07:24:41 +08:00
|
|
|
|
2020-04-07 04:56:13 +08:00
|
|
|
cl::opt<bool>
|
|
|
|
objdump::PrintImmHex("print-imm-hex",
|
|
|
|
cl::desc("Use hex format for immediate values"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2011-01-20 14:39:06 +08:00
|
|
|
|
2020-04-07 04:56:13 +08:00
|
|
|
cl::opt<bool>
|
|
|
|
objdump::PrivateHeaders("private-headers",
|
|
|
|
cl::desc("Display format specific file headers"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::alias PrivateHeadersShort("p",
|
|
|
|
cl::desc("Alias for --private-headers"),
|
|
|
|
cl::NotHidden, cl::Grouping,
|
|
|
|
cl::aliasopt(PrivateHeaders));
|
|
|
|
|
|
|
|
cl::list<std::string>
|
2020-04-07 04:56:13 +08:00
|
|
|
objdump::FilterSections("section",
|
|
|
|
cl::desc("Operate on the specified sections only. "
|
2020-11-30 19:32:46 +08:00
|
|
|
"With --macho dump segment,section"),
|
2020-04-07 04:56:13 +08:00
|
|
|
cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::alias FilterSectionsj("j", cl::desc("Alias for --section"),
|
|
|
|
cl::NotHidden, cl::Grouping, cl::Prefix,
|
|
|
|
cl::aliasopt(FilterSections));
|
|
|
|
|
2020-04-07 04:56:13 +08:00
|
|
|
cl::opt<bool> objdump::SectionHeaders(
|
|
|
|
"section-headers",
|
|
|
|
cl::desc("Display summaries of the headers for each section."),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-01-18 18:41:26 +08:00
|
|
|
static cl::alias SectionHeadersShort("headers",
|
|
|
|
cl::desc("Alias for --section-headers"),
|
|
|
|
cl::NotHidden,
|
|
|
|
cl::aliasopt(SectionHeaders));
|
|
|
|
static cl::alias SectionHeadersShorter("h",
|
|
|
|
cl::desc("Alias for --section-headers"),
|
2019-02-20 03:46:08 +08:00
|
|
|
cl::NotHidden, cl::Grouping,
|
2019-01-18 18:41:26 +08:00
|
|
|
cl::aliasopt(SectionHeaders));
|
2015-07-30 03:08:10 +08:00
|
|
|
|
2019-01-28 22:11:35 +08:00
|
|
|
static cl::opt<bool>
|
|
|
|
ShowLMA("show-lma",
|
2019-05-22 14:30:46 +08:00
|
|
|
cl::desc("Display LMA column when dumping ELF section headers"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-01-28 22:11:35 +08:00
|
|
|
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::opt<bool> PrintSource(
|
2016-08-16 03:49:24 +08:00
|
|
|
"source",
|
|
|
|
cl::desc(
|
2019-05-22 14:30:46 +08:00
|
|
|
"Display source inline with disassembly. Implies --disassemble"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2020-11-30 19:32:46 +08:00
|
|
|
static cl::alias PrintSourceShort("S", cl::desc("Alias for --source"),
|
2019-04-15 23:00:10 +08:00
|
|
|
cl::NotHidden, cl::Grouping,
|
|
|
|
cl::aliasopt(PrintSource));
|
2016-08-16 03:49:24 +08:00
|
|
|
|
2019-04-24 10:40:20 +08:00
|
|
|
static cl::opt<uint64_t>
|
2016-09-13 01:08:22 +08:00
|
|
|
StartAddress("start-address", cl::desc("Disassemble beginning at address"),
|
2019-05-22 14:30:46 +08:00
|
|
|
cl::value_desc("address"), cl::init(0), cl::cat(ObjdumpCat));
|
2019-04-24 10:40:20 +08:00
|
|
|
static cl::opt<uint64_t> StopAddress("stop-address",
|
|
|
|
cl::desc("Stop disassembly at address"),
|
|
|
|
cl::value_desc("address"),
|
2019-05-22 14:30:46 +08:00
|
|
|
cl::init(UINT64_MAX), cl::cat(ObjdumpCat));
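The two options above default to 0 and UINT64_MAX, so by default nothing is filtered out. A minimal sketch of how such a window could be applied to a section's address range follows. `clampToAddressRange` is a hypothetical helper, not a function in this file, and the half-open interpretation (the stop address itself is excluded) is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>

// Intersects the section range [SectionBegin, SectionEnd) with the
// requested window [Start, Stop). An empty result has first == second.
std::pair<uint64_t, uint64_t> clampToAddressRange(uint64_t SectionBegin,
                                                  uint64_t SectionEnd,
                                                  uint64_t Start,
                                                  uint64_t Stop) {
  assert(SectionBegin <= SectionEnd && "section range must be ordered");
  const uint64_t Begin = std::max(SectionBegin, Start);
  const uint64_t End = std::min(SectionEnd, Stop);
  if (Begin >= End)
    return std::make_pair(Begin, Begin); // nothing left to disassemble
  return std::make_pair(Begin, End);
}
```

With the default values, `clampToAddressRange(B, E, 0, UINT64_MAX)` returns the section range unchanged, which is why the defaults need no special casing.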
|
2019-01-10 22:55:26 +08:00
|
|
|
|
2020-04-07 04:56:13 +08:00
|
|
|
cl::opt<bool> objdump::SymbolTable("syms", cl::desc("Display the symbol table"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::alias SymbolTableShort("t", cl::desc("Alias for --syms"),
|
|
|
|
cl::NotHidden, cl::Grouping,
|
|
|
|
cl::aliasopt(SymbolTable));
|
|
|
|
|
[llvm-objdump] Symbolize binary addresses for low-noise asm diff.
When diffing the disassembly dumps of two binaries, I see a lot of noise from mismatched jump target addresses and global data references. This causes unnecessary diffs in every function and makes the comparison impractical. This change symbolizes the raw binary addresses to minimize the diff noise.
In this change, a local branch target is modeled as a label, and the branch target operand is simply printed as that label. Local labels are collected beforehand by a separate pre-decoding pass. A global data memory operand is printed as a global symbol instead of the raw data address. Unfortunately, because of the way the disassembler is set up, and to keep the change less intrusive, a global symbol is always printed as the last operand of a memory access instruction. This is less than ideal, but it is probably acceptable for checking code quality, since on most targets an instruction can have at most one memory operand.
So far only the X86 disassemblers are supported.
Test Plan:
llvm-objdump -d --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr :
```
Disassembly of section .text:
<_start>:
push rax
mov dword ptr [rsp + 4], 0
mov dword ptr [rsp], 0
mov eax, dword ptr [rsp]
cmp eax, dword ptr [rip + 4112] # 202182 <g>
jge 0x20117e <_start+0x25>
call 0x201158 <foo>
inc dword ptr [rsp]
jmp 0x201169 <_start+0x10>
xor eax, eax
pop rcx
ret
```
llvm-objdump -d **--symbolize-operands** --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr :
```
Disassembly of section .text:
<_start>:
push rax
mov dword ptr [rsp + 4], 0
mov dword ptr [rsp], 0
<L1>:
mov eax, dword ptr [rsp]
cmp eax, dword ptr <g>
jge <L0>
call <foo>
inc dword ptr [rsp]
jmp <L1>
<L0>:
xor eax, eax
pop rcx
ret
```
Note that without this change, a jump instruction such as `jge 0x20117e <_start+0x25>` is printed with a real target address and an offset from the leading symbol. When an optimizer change adds or deletes an instruction, the address and offset may shift for every target placed after that instruction. This becomes a problem when diffing the disassembly produced by two optimizers, because such branch target address changes show up as false positives. With `--symbolize-operands`, a label is printed for a branch target instead, which reduces the false positives. Similarly, the disassembly of PC-relative global variable references is prone to change on instruction insertion or deletion.
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D84191
2020-07-21 00:45:32 +08:00
|
|
|
static cl::opt<bool> SymbolizeOperands(
|
|
|
|
"symbolize-operands",
|
|
|
|
cl::desc("Symbolize instruction operands when disassembling"),
|
|
|
|
cl::cat(ObjdumpCat));
|
|
|
|
|
2020-04-07 04:56:13 +08:00
|
|
|
static cl::opt<bool> DynamicSymbolTable(
|
2020-04-05 09:58:53 +08:00
|
|
|
"dynamic-syms",
|
|
|
|
cl::desc("Display the contents of the dynamic symbol table"),
|
|
|
|
cl::cat(ObjdumpCat));
|
|
|
|
static cl::alias DynamicSymbolTableShort("T",
|
|
|
|
cl::desc("Alias for --dynamic-syms"),
|
|
|
|
cl::NotHidden, cl::Grouping,
|
|
|
|
cl::aliasopt(DynamicSymbolTable));
|
|
|
|
|
2020-11-30 19:32:46 +08:00
|
|
|
cl::opt<std::string>
|
|
|
|
objdump::TripleName("triple",
|
|
|
|
cl::desc("Target triple to disassemble for, see "
|
|
|
|
"--version for available targets"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
|
2020-04-07 04:56:13 +08:00
|
|
|
cl::opt<bool> objdump::UnwindInfo("unwind-info",
|
|
|
|
cl::desc("Display unwind information"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-04-15 23:00:10 +08:00
|
|
|
static cl::alias UnwindInfoShort("u", cl::desc("Alias for --unwind-info"),
|
2019-02-20 03:46:08 +08:00
|
|
|
cl::NotHidden, cl::Grouping,
|
2019-04-15 23:00:10 +08:00
|
|
|
cl::aliasopt(UnwindInfo));
|
2019-01-10 22:55:26 +08:00
|
|
|
|
2019-04-10 12:46:01 +08:00
|
|
|
static cl::opt<bool>
|
2019-05-22 14:30:46 +08:00
|
|
|
Wide("wide", cl::desc("Ignored for compatibility with GNU objdump"),
|
|
|
|
cl::cat(ObjdumpCat));
|
2019-04-10 12:46:01 +08:00
|
|
|
static cl::alias WideShort("w", cl::Grouping, cl::aliasopt(Wide));
|
|
|
|
|
2020-10-16 22:35:19 +08:00
|
|
|
cl::opt<std::string> objdump::Prefix("prefix",
|
|
|
|
cl::desc("Add prefix to absolute paths"),
|
|
|
|
cl::cat(ObjdumpCat));
|
|
|
|
|
2020-03-17 22:21:42 +08:00
|
|
|
enum DebugVarsFormat {
  DVDisabled,
  DVUnicode,
  DVASCII,
};
|
|
|
|
|
|
|
|
static cl::opt<DebugVarsFormat> DbgVariables(
    "debug-vars", cl::init(DVDisabled),
    cl::desc("Print the locations (in registers or memory) of "
             "source-level variables alongside disassembly"),
    cl::ValueOptional,
    cl::values(clEnumValN(DVUnicode, "", "unicode"),
               clEnumValN(DVUnicode, "unicode", "unicode"),
               clEnumValN(DVASCII, "ascii", "ascii")),
    cl::cat(ObjdumpCat));
|
|
|
|
|
|
|
|
static cl::opt<int>
    DbgIndent("debug-vars-indent", cl::init(40),
              cl::desc("Distance to indent the source-level variable display, "
                       "relative to the start of the disassembly"),
              cl::cat(ObjdumpCat));
|
|
|
|
|
2019-06-21 19:49:20 +08:00
|
|
|
static cl::extrahelp
|
|
|
|
HelpResponse("\nPass @FILE as argument to read options from FILE.\n");
|
|
|
|
|
2020-03-08 04:55:44 +08:00
|
|
|
static StringSet<> DisasmSymbolSet;
|
2020-04-10 03:32:09 +08:00
|
|
|
StringSet<> objdump::FoundSectionSet;
|
2011-09-20 01:56:04 +08:00
|
|
|
static StringRef ToolName;
|
2011-06-26 01:55:23 +08:00
|
|
|
|
Reland [llvm-objdump] Use a counter for llvm-objdump -h instead of the section index.
This relands r374931 (reverted in r375088). It fixes 32-bit builds by using the right format string specifier for uint64_t (PRIu64) instead of `%d`.
Original description:
When listing the index in `llvm-objdump -h`, use a zero-based counter instead of the actual section index (e.g. shdr->sh_index for ELF).
While this is effectively a noop for now (except one unit test for XCOFF), the index values will change in a future patch that filters certain sections out (e.g. symbol tables). See D68669 for more context. Note: the test case in `test/tools/llvm-objdump/X86/section-index.s` already covers the case of incrementing the section index counter when sections are skipped.
Reviewers: grimar, jhenderson, espindola
Reviewed By: grimar
Subscribers: emaste, sbc100, arichardson, aheejin, arphaman, seiya, llvm-commits, MaskRay
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68848
llvm-svn: 375178
2019-10-18 05:55:43 +08:00
|
|
|
namespace {
struct FilterResult {
  // True if the section should not be skipped.
  bool Keep;

  // True if the index counter should be incremented, even if the section should
  // be skipped. For example, sections may be skipped if they are not included
  // in the --section flag, but we still want those to count toward the section
  // count.
  bool IncrementIndex;
};
} // namespace
|
|
|
|
|
|
|
|
static FilterResult checkSectionFilter(object::SectionRef S) {
|
2019-05-22 23:12:51 +08:00
|
|
|
if (FilterSections.empty())
|
2019-10-18 05:55:43 +08:00
|
|
|
return {/*Keep=*/true, /*IncrementIndex=*/true};
|
2019-08-14 19:10:11 +08:00
|
|
|
|
|
|
|
Expected<StringRef> SecNameOrErr = S.getName();
|
|
|
|
if (!SecNameOrErr) {
|
|
|
|
consumeError(SecNameOrErr.takeError());
|
2019-10-18 05:55:43 +08:00
|
|
|
return {/*Keep=*/false, /*IncrementIndex=*/false};
|
2019-08-14 19:10:11 +08:00
|
|
|
}
|
|
|
|
StringRef SecName = *SecNameOrErr;
|
|
|
|
|
2019-07-03 02:38:17 +08:00
|
|
|
// StringSet does not allow empty key so avoid adding sections with
|
|
|
|
// no name (such as the section with index 0) here.
|
|
|
|
if (!SecName.empty())
|
|
|
|
FoundSectionSet.insert(SecName);
|
2019-10-18 05:55:43 +08:00
|
|
|
|
|
|
|
// Only show the section if it's in the FilterSections list, but always
|
|
|
|
// increment so the indexing is stable.
|
|
|
|
return {/*Keep=*/is_contained(FilterSections, SecName),
|
|
|
|
/*IncrementIndex=*/true};
|
2019-05-22 23:12:51 +08:00
|
|
|
}
|
|
|
|
|
2020-05-31 09:00:14 +08:00
|
|
|
SectionFilter objdump::ToolSectionFilter(object::ObjectFile const &O,
|
|
|
|
uint64_t *Idx) {
|
2019-10-18 05:55:43 +08:00
|
|
|
  // Start at UINT64_MAX so that the first index returned after an increment is
  // zero (after the unsigned wrap).
  if (Idx)
    *Idx = UINT64_MAX;
  return SectionFilter(
      [Idx](object::SectionRef S) {
        FilterResult Result = checkSectionFilter(S);
        if (Idx != nullptr && Result.IncrementIndex)
          *Idx += 1;
        return Result.Keep;
      },
      O);
|
2015-07-29 23:45:39 +08:00
|
|
|
}
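// Illustrative sketch, not part of llvm-objdump: the counter trick used by
// ToolSectionFilter can be reproduced with standard containers. The name
// keptIndices and the string-based section list are assumptions made for
// this example; unlike the real code, it does not model sections whose name
// cannot be read (those neither match nor advance the counter).

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the real filter: returns the counter value
// assigned to each section that passes the filter.
std::vector<uint64_t> keptIndices(const std::vector<std::string> &Sections,
                                  const std::vector<std::string> &Filter) {
  // Start at UINT64_MAX so that the first increment wraps around to zero.
  uint64_t Idx = UINT64_MAX;
  std::vector<uint64_t> Kept;
  for (const std::string &Name : Sections) {
    // Every section advances the counter, even one the filter rejects, so
    // the index of a section does not depend on which sections were requested.
    Idx += 1;
    bool Keep = Filter.empty() ||
                std::find(Filter.begin(), Filter.end(), Name) != Filter.end();
    if (Keep)
      Kept.push_back(Idx);
  }
  return Kept;
}
```

// With sections .text, .data, .bss and a filter of only .bss, the single kept
// section still gets index 2 rather than 0.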
|
|
|
|
|
2020-05-31 09:00:14 +08:00
|
|
|
std::string objdump::getFileNameForError(const object::Archive::Child &C,
|
|
|
|
unsigned Index) {
|
2019-08-20 21:19:16 +08:00
|
|
|
Expected<StringRef> NameOrErr = C.getName();
|
|
|
|
if (NameOrErr)
|
2020-01-29 03:23:46 +08:00
|
|
|
return std::string(NameOrErr.get());
|
2019-08-20 21:19:16 +08:00
|
|
|
// If we have an error getting the name then we print the index of the archive
|
|
|
|
// member. Since we are already in an error state, we just ignore this error.
|
|
|
|
consumeError(NameOrErr.takeError());
|
|
|
|
return "<file index: " + std::to_string(Index) + ">";
|
|
|
|
}
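// Illustrative sketch, not part of llvm-objdump: the fallback in
// getFileNameForError reduces to "use the name if there is one, otherwise
// describe the member by its index". The helper name fileNameForError and the
// nullable pointer standing in for Expected<StringRef> are assumptions made
// for this example.

```cpp
#include <string>

// Hypothetical helper: Name is null when the member name could not be read.
std::string fileNameForError(const std::string *Name, unsigned Index) {
  if (Name)
    return *Name;
  // No usable name: fall back to the position of the member in the archive.
  return "<file index: " + std::to_string(Index) + ">";
}
```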
|
|
|
|
|
2021-01-07 20:34:11 +08:00
|
|
|
void objdump::reportWarning(const Twine &Message, StringRef File) {
|
2019-07-10 05:53:33 +08:00
|
|
|
// Output order between errs() and outs() matters especially for archive
|
|
|
|
// files where the output is per member object.
|
|
|
|
outs().flush();
|
2019-08-21 19:07:31 +08:00
|
|
|
WithColor::warning(errs(), ToolName)
|
|
|
|
<< "'" << File << "': " << Message << "\n";
|
2019-06-08 04:34:31 +08:00
|
|
|
}
|
|
|
|
|
2020-05-31 09:00:14 +08:00
|
|
|
LLVM_ATTRIBUTE_NORETURN void objdump::reportError(StringRef File,
|
2021-01-07 20:34:11 +08:00
|
|
|
const Twine &Message) {
|
2020-05-31 08:25:18 +08:00
|
|
|
outs().flush();
|
2019-08-21 19:07:31 +08:00
|
|
|
WithColor::error(errs(), ToolName) << "'" << File << "': " << Message << "\n";
|
2016-11-17 06:17:38 +08:00
|
|
|
exit(1);
|
|
|
|
}
|
|
|
|
|
2020-05-31 09:00:14 +08:00
|
|
|
LLVM_ATTRIBUTE_NORETURN void objdump::reportError(Error E, StringRef FileName,
|
|
|
|
StringRef ArchiveName,
|
|
|
|
StringRef ArchitectureName) {
|
2016-05-18 01:10:12 +08:00
|
|
|
assert(E);
|
2020-05-31 08:25:18 +08:00
|
|
|
outs().flush();
|
2018-11-12 06:12:04 +08:00
|
|
|
WithColor::error(errs(), ToolName);
|
2016-05-18 01:10:12 +08:00
|
|
|
if (!ArchiveName.empty())
|
|
|
|
errs() << ArchiveName << "(" << FileName << ")";
|
|
|
|
else
|
2016-10-27 06:37:52 +08:00
|
|
|
errs() << "'" << FileName << "'";
|
2016-06-01 04:35:34 +08:00
|
|
|
if (!ArchitectureName.empty())
|
|
|
|
errs() << " (for architecture " << ArchitectureName << ")";
|
2020-05-31 08:25:18 +08:00
|
|
|
errs() << ": ";
|
|
|
|
logAllUnhandledErrors(std::move(E), errs());
|
2016-05-18 01:10:12 +08:00
|
|
|
exit(1);
|
|
|
|
}
|
|
|
|
|
2021-01-07 20:34:11 +08:00
|
|
|
static void reportCmdLineWarning(const Twine &Message) {
|
2019-08-21 19:07:31 +08:00
|
|
|
WithColor::warning(errs(), ToolName) << Message << "\n";
|
|
|
|
}
|
|
|
|
|
2021-01-07 20:34:11 +08:00
|
|
|
LLVM_ATTRIBUTE_NORETURN static void reportCmdLineError(const Twine &Message) {
|
2019-08-21 19:07:31 +08:00
|
|
|
WithColor::error(errs(), ToolName) << Message << "\n";
|
|
|
|
exit(1);
|
|
|
|
}
|
|
|
|
|
2019-07-03 02:38:17 +08:00
|
|
|
static void warnOnNoMatchForSections() {
|
|
|
|
SetVector<StringRef> MissingSections;
|
|
|
|
for (StringRef S : FilterSections) {
|
|
|
|
if (FoundSectionSet.count(S))
|
|
|
|
return;
|
|
|
|
// User may specify an unnamed section. Don't warn for it.
|
|
|
|
if (!S.empty())
|
|
|
|
MissingSections.insert(S);
|
|
|
|
}
|
|
|
|
|
|
|
|
// Warn only if no section in FilterSections is matched.
|
|
|
|
for (StringRef S : MissingSections)
|
2019-08-21 19:07:31 +08:00
|
|
|
reportCmdLineWarning("section '" + S +
|
|
|
|
"' mentioned in a -j/--section option, but not "
|
|
|
|
"found in any input file");
|
2019-07-03 02:38:17 +08:00
|
|
|
}
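// Illustrative sketch, not part of llvm-objdump: the early return above means
// that a single matched section suppresses all warnings, so warnings are
// emitted only when nothing in the filter list was found. The function name
// sectionsToWarnAbout and the standard containers are assumptions made for
// this example; the duplicate check mimics the SetVector used above.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: returns the section names that would be warned about.
std::vector<std::string>
sectionsToWarnAbout(const std::vector<std::string> &Filter,
                    const std::set<std::string> &Found) {
  std::vector<std::string> Missing;
  for (const std::string &S : Filter) {
    // One match anywhere in the filter list suppresses every warning.
    if (Found.count(S))
      return {};
    // An empty name denotes an unnamed section; never warn for it. Skip
    // names that were already recorded so each is reported once.
    if (!S.empty() &&
        std::find(Missing.begin(), Missing.end(), S) == Missing.end())
      Missing.push_back(S);
  }
  return Missing;
}
```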
|
|
|
|
|
2019-08-21 19:07:31 +08:00
|
|
|
static const Target *getTarget(const ObjectFile *Obj) {
|
2011-01-20 14:39:06 +08:00
|
|
|
// Figure out the target triple.
|
2019-04-15 23:31:42 +08:00
|
|
|
Triple TheTriple("unknown-unknown-unknown");
|
2011-01-20 15:22:04 +08:00
|
|
|
if (TripleName.empty()) {
|
2019-08-21 19:07:31 +08:00
|
|
|
TheTriple = Obj->makeTriple();
|
2017-01-18 21:52:12 +08:00
|
|
|
} else {
|
2012-05-09 07:38:45 +08:00
|
|
|
TheTriple.setTriple(Triple::normalize(TripleName));
|
2019-08-21 19:07:31 +08:00
|
|
|
auto Arch = Obj->getArch();
|
|
|
|
if (Arch == Triple::arm || Arch == Triple::armeb)
|
|
|
|
Obj->setARMSubArch(TheTriple);
|
2017-01-18 21:52:12 +08:00
|
|
|
}
|
2011-01-20 14:39:06 +08:00
|
|
|
|
|
|
|
// Get the target specific parser.
|
|
|
|
std::string Error;
|
2012-05-09 07:38:45 +08:00
|
|
|
const Target *TheTarget = TargetRegistry::lookupTarget(ArchName, TheTriple,
|
|
|
|
Error);
|
2019-08-21 19:07:31 +08:00
|
|
|
if (!TheTarget)
|
|
|
|
reportError(Obj->getFileName(), "can't find target: " + Error);
|
2012-05-09 07:38:45 +08:00
|
|
|
|
|
|
|
// Update the triple name and return the found target.
|
|
|
|
TripleName = TheTriple.getTriple();
|
|
|
|
return TheTarget;
|
2011-01-20 14:39:06 +08:00
|
|
|
}
|
|
|
|
|
2020-05-31 09:00:14 +08:00
|
|
|
bool objdump::isRelocAddressLess(RelocationRef A, RelocationRef B) {
|
2019-01-15 17:19:18 +08:00
|
|
|
return A.getOffset() < B.getOffset();
|
2011-10-14 06:17:18 +08:00
|
|
|
}
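// Illustrative sketch, not part of llvm-objdump: a comparator of this shape is
// typically handed to a sort so that relocations are visited in address order.
// How isRelocAddressLess is actually called is not shown in this part of the
// file, so the stable sort below is an assumption; Reloc, isOffsetLess and
// sortedByOffset are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical, minimal relocation record.
struct Reloc {
  uint64_t Offset;
  int Id;
};

bool isOffsetLess(const Reloc &A, const Reloc &B) { return A.Offset < B.Offset; }

// Sorts by offset; a stable sort keeps relocations that share an offset in
// their original relative order.
std::vector<Reloc> sortedByOffset(std::vector<Reloc> Relocs) {
  std::stable_sort(Relocs.begin(), Relocs.end(), isOffsetLess);
  return Relocs;
}
```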
|
|
|
|
|
2019-04-09 00:24:08 +08:00
|
|
|
static Error getRelocationValueString(const RelocationRef &Rel,
|
|
|
|
SmallVectorImpl<char> &Result) {
|
2018-05-15 03:46:08 +08:00
|
|
|
const ObjectFile *Obj = Rel.getObject();
|
|
|
|
if (auto *ELF = dyn_cast<ELFObjectFileBase>(Obj))
|
2019-01-18 19:33:26 +08:00
|
|
|
return getELFRelocationValueString(ELF, Rel, Result);
|
2018-05-15 03:46:08 +08:00
|
|
|
if (auto *COFF = dyn_cast<COFFObjectFile>(Obj))
|
2019-01-18 19:33:26 +08:00
|
|
|
return getCOFFRelocationValueString(COFF, Rel, Result);
|
2018-05-15 03:46:08 +08:00
|
|
|
if (auto *Wasm = dyn_cast<WasmObjectFile>(Obj))
|
2019-01-18 19:33:26 +08:00
|
|
|
return getWasmRelocationValueString(Wasm, Rel, Result);
|
2018-05-15 03:46:08 +08:00
|
|
|
if (auto *MachO = dyn_cast<MachOObjectFile>(Obj))
|
2019-01-18 19:33:26 +08:00
|
|
|
return getMachORelocationValueString(MachO, Rel, Result);
|
2020-03-28 00:02:27 +08:00
|
|
|
if (auto *XCOFF = dyn_cast<XCOFFObjectFile>(Obj))
|
|
|
|
return getXCOFFRelocationValueString(XCOFF, Rel, Result);
|
2018-05-15 03:46:08 +08:00
|
|
|
llvm_unreachable("unknown object file format");
|
|
|
|
}
|
|
|
|
|
|
|
|
/// Indicates whether this relocation should be hidden when listing
/// relocations, usually because it is the trailing part of a multipart
/// relocation that will be printed as part of the leading relocation.
static bool getHidden(RelocationRef RelRef) {
|
2019-01-15 17:19:18 +08:00
|
|
|
auto *MachO = dyn_cast<MachOObjectFile>(RelRef.getObject());
|
2018-05-15 03:46:08 +08:00
|
|
|
if (!MachO)
|
|
|
|
return false;
|
|
|
|
|
|
|
|
unsigned Arch = MachO->getArch();
|
|
|
|
DataRefImpl Rel = RelRef.getRawDataRefImpl();
|
|
|
|
uint64_t Type = MachO->getRelocationType(Rel);
|
|
|
|
|
|
|
|
// On arches that use the generic relocations, GENERIC_RELOC_PAIR
|
|
|
|
// is always hidden.
|
2019-01-15 17:19:18 +08:00
|
|
|
if (Arch == Triple::x86 || Arch == Triple::arm || Arch == Triple::ppc)
|
|
|
|
return Type == MachO::GENERIC_RELOC_PAIR;
|
|
|
|
|
|
|
|
if (Arch == Triple::x86_64) {
|
2018-05-15 03:46:08 +08:00
|
|
|
// On x86_64, X86_64_RELOC_UNSIGNED is hidden only when it follows
|
|
|
|
// an X86_64_RELOC_SUBTRACTOR.
|
|
|
|
if (Type == MachO::X86_64_RELOC_UNSIGNED && Rel.d.a > 0) {
|
|
|
|
DataRefImpl RelPrev = Rel;
|
|
|
|
RelPrev.d.a--;
|
|
|
|
uint64_t PrevType = MachO->getRelocationType(RelPrev);
|
|
|
|
if (PrevType == MachO::X86_64_RELOC_SUBTRACTOR)
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return false;
|
|
|
|
}
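// Illustrative sketch, not part of llvm-objdump: the x86_64 rule in getHidden
// says that an UNSIGNED relocation is hidden only when the relocation directly
// before it is a SUBTRACTOR. RelocKind and isHiddenX86_64 are hypothetical
// names; the real relocation type constants come from the MachO namespace.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical relocation kinds used only for this example.
enum class RelocKind { Unsigned, Subtractor, Other };

// Returns true if the relocation at Index is the trailing half of a
// SUBTRACTOR/UNSIGNED pair and should therefore not be listed by itself.
bool isHiddenX86_64(const std::vector<RelocKind> &Relocs, std::size_t Index) {
  // The first relocation has no predecessor, so it can never be a trailer.
  if (Relocs[Index] != RelocKind::Unsigned || Index == 0)
    return false;
  return Relocs[Index - 1] == RelocKind::Subtractor;
}
```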
|
|
|
|
|
|
|
|
namespace {
|
2020-03-17 22:21:42 +08:00
|
|
|
|
|
|
|
/// Get the column at which we want to start printing the instruction
|
|
|
|
/// disassembly, taking into account anything which appears to the left of it.
|
|
|
|
unsigned getInstStartColumn(const MCSubtargetInfo &STI) {
|
|
|
|
return NoShowRawInsn ? 16 : STI.getTargetTriple().isX86() ? 40 : 24;
|
|
|
|
}
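// Illustrative sketch, not part of llvm-objdump: the nested conditional in
// getInstStartColumn is easier to read when unrolled. instStartColumn is a
// hypothetical name, and the two booleans stand in for the NoShowRawInsn
// option and the isX86() query on the target triple.

```cpp
// Column at which the instruction text starts: a narrow column when raw
// instruction bytes are suppressed, otherwise wider on x86 than elsewhere.
unsigned instStartColumn(bool NoShowRawInsn, bool IsX86) {
  if (NoShowRawInsn)
    return 16;
  return IsX86 ? 40 : 24;
}
```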
|
|
|
|
|
|
|
|
/// Stores a single expression representing the location of a source-level
/// variable, along with the PC range for which that expression is valid.
struct LiveVariable {
  DWARFLocationExpression LocExpr;
  const char *VarName;
  DWARFUnit *Unit;
  const DWARFDie FuncDie;

  LiveVariable(const DWARFLocationExpression &LocExpr, const char *VarName,
               DWARFUnit *Unit, const DWARFDie FuncDie)
      : LocExpr(LocExpr), VarName(VarName), Unit(Unit), FuncDie(FuncDie) {}

  bool liveAtAddress(object::SectionedAddress Addr) {
    if (LocExpr.Range == None)
      return false;
    return LocExpr.Range->SectionIndex == Addr.SectionIndex &&
           LocExpr.Range->LowPC <= Addr.Address &&
           LocExpr.Range->HighPC > Addr.Address;
  }

  void print(raw_ostream &OS, const MCRegisterInfo &MRI) const {
    DataExtractor Data({LocExpr.Expr.data(), LocExpr.Expr.size()},
                       Unit->getContext().isLittleEndian(), 0);
    DWARFExpression Expression(Data, Unit->getAddressByteSize());
    Expression.printCompact(OS, MRI);
  }
};
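// Illustrative sketch, not part of llvm-objdump: liveAtAddress treats the PC
// range as half-open, so LowPC is inside the range and HighPC is the first
// address outside it. AddressRange, Address and liveAt are hypothetical names
// that mirror the shape of the DWARF range and sectioned address used above.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the range and address types.
struct AddressRange {
  uint64_t LowPC;
  uint64_t HighPC;
  uint64_t SectionIndex;
};

struct Address {
  uint64_t Value;
  uint64_t SectionIndex;
};

// True if A lies in the half-open interval [LowPC, HighPC) of the same section.
bool liveAt(const AddressRange &R, const Address &A) {
  return R.SectionIndex == A.SectionIndex && R.LowPC <= A.Value &&
         A.Value < R.HighPC;
}
```

// The half-open convention lets two adjacent ranges share a boundary address
// without both being considered live at it.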
|
|
|
|
|
|
|
|
/// Helper class for printing source variable locations alongside disassembly.
|
|
|
|
class LiveVariablePrinter {
|
|
|
|
// Information we want to track about one column in which we are printing a
|
|
|
|
// variable live range.
|
|
|
|
struct Column {
|
|
|
|
unsigned VarIdx = NullVarIdx;
|
|
|
|
bool LiveIn = false;
|
|
|
|
bool LiveOut = false;
|
|
|
|
bool MustDrawLabel = false;
|
|
|
|
|
|
|
|
bool isActive() const { return VarIdx != NullVarIdx; }
|
|
|
|
|
|
|
|
static constexpr unsigned NullVarIdx = std::numeric_limits<unsigned>::max();
|
|
|
|
};
|
|
|
|
|
|
|
|
// All live variables we know about in the object/image file.
|
|
|
|
std::vector<LiveVariable> LiveVariables;
|
|
|
|
|
|
|
|
// The columns we are currently drawing.
|
|
|
|
IndexedMap<Column> ActiveCols;
|
|
|
|
|
|
|
|
const MCRegisterInfo &MRI;
|
|
|
|
const MCSubtargetInfo &STI;
|
|
|
|
|
|
|
|
void addVariable(DWARFDie FuncDie, DWARFDie VarDie) {
|
|
|
|
uint64_t FuncLowPC, FuncHighPC, SectionIndex;
|
|
|
|
FuncDie.getLowAndHighPC(FuncLowPC, FuncHighPC, SectionIndex);
|
|
|
|
const char *VarName = VarDie.getName(DINameKind::ShortName);
|
|
|
|
DWARFUnit *U = VarDie.getDwarfUnit();
|
|
|
|
|
|
|
|
Expected<DWARFLocationExpressionsVector> Locs =
|
|
|
|
VarDie.getLocations(dwarf::DW_AT_location);
|
|
|
|
if (!Locs) {
|
|
|
|
// If the variable doesn't have any locations, just ignore it. We don't
|
|
|
|
// report an error or warning here as that could be noisy on optimised
|
|
|
|
// code.
|
|
|
|
consumeError(Locs.takeError());
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
for (const DWARFLocationExpression &LocExpr : *Locs) {
|
|
|
|
if (LocExpr.Range) {
|
|
|
|
LiveVariables.emplace_back(LocExpr, VarName, U, FuncDie);
|
|
|
|
} else {
|
|
|
|
// If the LocExpr does not have an associated range, it is valid for
|
|
|
|
// the whole of the function.
|
|
|
|
// TODO: technically it is not valid for any range covered by another
|
|
|
|
// LocExpr, does that happen in reality?
|
|
|
|
DWARFLocationExpression WholeFuncExpr{
|
|
|
|
DWARFAddressRange(FuncLowPC, FuncHighPC, SectionIndex),
|
|
|
|
LocExpr.Expr};
|
|
|
|
LiveVariables.emplace_back(WholeFuncExpr, VarName, U, FuncDie);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
void addFunction(DWARFDie D) {
|
|
|
|
for (const DWARFDie &Child : D.children()) {
|
|
|
|
if (Child.getTag() == dwarf::DW_TAG_variable ||
|
|
|
|
Child.getTag() == dwarf::DW_TAG_formal_parameter)
|
|
|
|
addVariable(D, Child);
|
|
|
|
else
|
|
|
|
addFunction(Child);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
// Get the column number (in characters) at which the first live variable
|
|
|
|
// line should be printed.
|
|
|
|
unsigned getIndentLevel() const {
|
|
|
|
return DbgIndent + getInstStartColumn(STI);
|
|
|
|
}
|
|
|
|
|
|
|
|
// Indent to the first live-range column to the right of the currently
|
|
|
|
// printed line, and return the index of that column.
|
|
|
|
// TODO: formatted_raw_ostream uses "column" to mean a number of characters
|
|
|
|
// since the last \n, and we use it to mean the number of slots in which we
|
|
|
|
// put live variable lines. Pick a less overloaded word.
|
|
|
|
unsigned moveToFirstVarColumn(formatted_raw_ostream &OS) {
|
|
|
|
// Logical column number: column zero is the first column we print in, each
|
|
|
|
// logical column is 2 physical columns wide.
|
|
|
|
unsigned FirstUnprintedLogicalColumn =
|
|
|
|
std::max((int)(OS.getColumn() - getIndentLevel() + 1) / 2, 0);
|
|
|
|
// Physical column number: the actual column number in characters, with
|
|
|
|
// zero being the left-most side of the screen.
|
|
|
|
unsigned FirstUnprintedPhysicalColumn =
|
|
|
|
getIndentLevel() + FirstUnprintedLogicalColumn * 2;
|
|
|
|
|
|
|
|
if (FirstUnprintedPhysicalColumn > OS.getColumn())
|
|
|
|
OS.PadToColumn(FirstUnprintedPhysicalColumn);
|
|
|
|
|
|
|
|
return FirstUnprintedLogicalColumn;
|
|
|
|
}
|
|
|
|
|
|
|
|
unsigned findFreeColumn() {
|
|
|
|
for (unsigned ColIdx = 0; ColIdx < ActiveCols.size(); ++ColIdx)
|
|
|
|
if (!ActiveCols[ColIdx].isActive())
|
|
|
|
return ColIdx;
|
|
|
|
|
|
|
|
size_t OldSize = ActiveCols.size();
|
|
|
|
ActiveCols.grow(std::max<size_t>(OldSize * 2, 1));
|
|
|
|
return OldSize;
|
|
|
|
}
|
|
|
|
|
|
|
|
public:
|
|
|
|
LiveVariablePrinter(const MCRegisterInfo &MRI, const MCSubtargetInfo &STI)
|
|
|
|
: LiveVariables(), ActiveCols(Column()), MRI(MRI), STI(STI) {}
|
|
|
|
|
|
|
|
void dump() const {
|
|
|
|
for (const LiveVariable &LV : LiveVariables) {
|
|
|
|
dbgs() << LV.VarName << " @ " << LV.LocExpr.Range << ": ";
|
|
|
|
LV.print(dbgs(), MRI);
|
|
|
|
dbgs() << "\n";
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
void addCompileUnit(DWARFDie D) {
|
|
|
|
if (D.getTag() == dwarf::DW_TAG_subprogram)
|
|
|
|
addFunction(D);
|
|
|
|
else
|
|
|
|
for (const DWARFDie &Child : D.children())
|
|
|
|
addFunction(Child);
|
|
|
|
}
|
|
|
|
|
|
|
|
/// Update to match the state of the instruction between ThisAddr and
|
|
|
|
/// NextAddr. In the common case, any live range active at ThisAddr is
|
|
|
|
/// live-in to the instruction, and any live range active at NextAddr is
|
|
|
|
/// live-out of the instruction. If IncludeDefinedVars is false, then live
|
|
|
|
/// ranges starting at NextAddr will be ignored.
|
|
|
|
void update(object::SectionedAddress ThisAddr,
|
|
|
|
object::SectionedAddress NextAddr, bool IncludeDefinedVars) {
|
|
|
|
// First, check variables which have already been assigned a column, so
|
|
|
|
// that we don't change their order.
|
|
|
|
SmallSet<unsigned, 8> CheckedVarIdxs;
|
|
|
|
for (unsigned ColIdx = 0, End = ActiveCols.size(); ColIdx < End; ++ColIdx) {
|
|
|
|
if (!ActiveCols[ColIdx].isActive())
|
|
|
|
continue;
|
|
|
|
CheckedVarIdxs.insert(ActiveCols[ColIdx].VarIdx);
|
|
|
|
LiveVariable &LV = LiveVariables[ActiveCols[ColIdx].VarIdx];
|
|
|
|
ActiveCols[ColIdx].LiveIn = LV.liveAtAddress(ThisAddr);
|
|
|
|
ActiveCols[ColIdx].LiveOut = LV.liveAtAddress(NextAddr);
|
|
|
|
LLVM_DEBUG(dbgs() << "pass 1, " << ThisAddr.Address << "-"
|
|
|
|
<< NextAddr.Address << ", " << LV.VarName << ", Col "
|
|
|
|
<< ColIdx << ": LiveIn=" << ActiveCols[ColIdx].LiveIn
|
|
|
|
<< ", LiveOut=" << ActiveCols[ColIdx].LiveOut << "\n");
|
|
|
|
|
|
|
|
if (!ActiveCols[ColIdx].LiveIn && !ActiveCols[ColIdx].LiveOut)
|
|
|
|
ActiveCols[ColIdx].VarIdx = Column::NullVarIdx;
|
|
|
|
}
|
|
|
|
|
|
|
|
// Next, look for variables which don't already have a column, but which
|
|
|
|
// are now live.
|
|
|
|
if (IncludeDefinedVars) {
|
|
|
|
for (unsigned VarIdx = 0, End = LiveVariables.size(); VarIdx < End;
|
|
|
|
++VarIdx) {
|
|
|
|
if (CheckedVarIdxs.count(VarIdx))
|
|
|
|
continue;
|
|
|
|
LiveVariable &LV = LiveVariables[VarIdx];
|
|
|
|
bool LiveIn = LV.liveAtAddress(ThisAddr);
|
|
|
|
bool LiveOut = LV.liveAtAddress(NextAddr);
|
|
|
|
if (!LiveIn && !LiveOut)
|
|
|
|
continue;
|
|
|
|
|
|
|
|
unsigned ColIdx = findFreeColumn();
|
|
|
|
LLVM_DEBUG(dbgs() << "pass 2, " << ThisAddr.Address << "-"
|
|
|
|
<< NextAddr.Address << ", " << LV.VarName << ", Col "
|
|
|
|
<< ColIdx << ": LiveIn=" << LiveIn
|
|
|
|
<< ", LiveOut=" << LiveOut << "\n");
|
|
|
|
ActiveCols[ColIdx].VarIdx = VarIdx;
|
|
|
|
ActiveCols[ColIdx].LiveIn = LiveIn;
|
|
|
|
ActiveCols[ColIdx].LiveOut = LiveOut;
|
|
|
|
ActiveCols[ColIdx].MustDrawLabel = true;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
enum class LineChar {
|
|
|
|
RangeStart,
|
|
|
|
RangeMid,
|
|
|
|
RangeEnd,
|
|
|
|
LabelVert,
|
|
|
|
LabelCornerNew,
|
|
|
|
LabelCornerActive,
|
|
|
|
LabelHoriz,
|
|
|
|
};
|
|
|
|
const char *getLineChar(LineChar C) const {
|
|
|
|
bool IsASCII = DbgVariables == DVASCII;
|
|
|
|
switch (C) {
|
|
|
|
case LineChar::RangeStart:
|
2020-12-17 18:41:35 +08:00
|
|
|
return IsASCII ? "^" : (const char *)u8"\u2548";
|
2020-03-17 22:21:42 +08:00
|
|
|
case LineChar::RangeMid:
|
2020-12-17 18:41:35 +08:00
|
|
|
return IsASCII ? "|" : (const char *)u8"\u2503";
|
2020-03-17 22:21:42 +08:00
|
|
|
case LineChar::RangeEnd:
|
2020-12-17 18:41:35 +08:00
|
|
|
return IsASCII ? "v" : (const char *)u8"\u253b";
|
2020-03-17 22:21:42 +08:00
|
|
|
case LineChar::LabelVert:
|
2020-12-17 18:41:35 +08:00
|
|
|
return IsASCII ? "|" : (const char *)u8"\u2502";
|
2020-03-17 22:21:42 +08:00
|
|
|
case LineChar::LabelCornerNew:
|
2020-12-17 18:41:35 +08:00
|
|
|
return IsASCII ? "/" : (const char *)u8"\u250c";
|
2020-03-17 22:21:42 +08:00
|
|
|
case LineChar::LabelCornerActive:
|
2020-12-17 18:41:35 +08:00
|
|
|
return IsASCII ? "|" : (const char *)u8"\u2520";
|
2020-03-17 22:21:42 +08:00
|
|
|
case LineChar::LabelHoriz:
|
2020-12-17 18:41:35 +08:00
|
|
|
return IsASCII ? "-" : (const char *)u8"\u2500";
|
2020-03-17 22:21:42 +08:00
|
|
|
}
|
2020-07-09 22:00:57 +08:00
|
|
|
llvm_unreachable("Unhandled LineChar enum");
|
2020-03-17 22:21:42 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
/// Print live ranges to the right of an existing line. This assumes the
|
|
|
|
/// line is not an instruction, so doesn't start or end any live ranges, so
|
|
|
|
/// we only need to print active ranges or empty columns. If AfterInst is
|
|
|
|
/// true, this is being printed after the last instruction fed to update(),
|
|
|
|
/// otherwise this is being printed before it.
|
|
|
|
void printAfterOtherLine(formatted_raw_ostream &OS, bool AfterInst) {
|
|
|
|
if (ActiveCols.size()) {
|
|
|
|
unsigned FirstUnprintedColumn = moveToFirstVarColumn(OS);
|
|
|
|
for (size_t ColIdx = FirstUnprintedColumn, End = ActiveCols.size();
|
|
|
|
ColIdx < End; ++ColIdx) {
|
|
|
|
if (ActiveCols[ColIdx].isActive()) {
|
|
|
|
if ((AfterInst && ActiveCols[ColIdx].LiveOut) ||
|
|
|
|
(!AfterInst && ActiveCols[ColIdx].LiveIn))
|
|
|
|
OS << getLineChar(LineChar::RangeMid);
|
|
|
|
else if (!AfterInst && ActiveCols[ColIdx].LiveOut)
|
|
|
|
OS << getLineChar(LineChar::LabelVert);
|
|
|
|
else
|
|
|
|
OS << " ";
|
|
|
|
}
|
|
|
|
OS << " ";
|
|
|
|
}
|
|
|
|
}
|
|
|
|
OS << "\n";
|
|
|
|
}
|
|
|
|
|
|
|
|

  /// Print any live variable range info needed to the right of a
  /// non-instruction line of disassembly. This is where we print the variable
  /// names and expressions, with thin line-drawing characters connecting them
  /// to the live range which starts at the next instruction. If MustPrint is
  /// true, we have to print at least one line (with the continuation of any
  /// already-active live ranges) because something has already been printed
  /// earlier on this line.
  void printBetweenInsts(formatted_raw_ostream &OS, bool MustPrint) {
    bool PrintedSomething = false;
    for (unsigned ColIdx = 0, End = ActiveCols.size(); ColIdx < End; ++ColIdx) {
      if (ActiveCols[ColIdx].isActive() && ActiveCols[ColIdx].MustDrawLabel) {
        // First we need to print the live range markers for any active
        // columns to the left of this one.
        OS.PadToColumn(getIndentLevel());
        for (unsigned ColIdx2 = 0; ColIdx2 < ColIdx; ++ColIdx2) {
          if (ActiveCols[ColIdx2].isActive()) {
            if (ActiveCols[ColIdx2].MustDrawLabel &&
                !ActiveCols[ColIdx2].LiveIn)
              OS << getLineChar(LineChar::LabelVert) << " ";
            else
              OS << getLineChar(LineChar::RangeMid) << " ";
          } else
            OS << " ";
        }

        // Then print the variable name and location of the new live range,
        // with box drawing characters joining it to the live range line.
        OS << getLineChar(ActiveCols[ColIdx].LiveIn
                              ? LineChar::LabelCornerActive
                              : LineChar::LabelCornerNew)
           << getLineChar(LineChar::LabelHoriz) << " ";
        WithColor(OS, raw_ostream::GREEN)
            << LiveVariables[ActiveCols[ColIdx].VarIdx].VarName;
        OS << " = ";
        {
          WithColor ExprColor(OS, raw_ostream::CYAN);
          LiveVariables[ActiveCols[ColIdx].VarIdx].print(OS, MRI);
        }

        // If there are any columns to the right of the expression we just
        // printed, then continue their live range lines.
        unsigned FirstUnprintedColumn = moveToFirstVarColumn(OS);
        for (unsigned ColIdx2 = FirstUnprintedColumn, End = ActiveCols.size();
             ColIdx2 < End; ++ColIdx2) {
          if (ActiveCols[ColIdx2].isActive() && ActiveCols[ColIdx2].LiveIn)
            OS << getLineChar(LineChar::RangeMid) << " ";
          else
            OS << " ";
        }

        OS << "\n";
        PrintedSomething = true;
      }
    }

    for (unsigned ColIdx = 0, End = ActiveCols.size(); ColIdx < End; ++ColIdx)
      if (ActiveCols[ColIdx].isActive())
        ActiveCols[ColIdx].MustDrawLabel = false;

    // If we must print something (because we printed a line/column number),
    // but don't have any new variables to print, then print a line which
    // just continues any existing live ranges.
    if (MustPrint && !PrintedSomething)
      printAfterOtherLine(OS, false);
  }

  /// Print the live variable ranges to the right of a disassembled instruction.
  void printAfterInst(formatted_raw_ostream &OS) {
    if (!ActiveCols.size())
      return;
    unsigned FirstUnprintedColumn = moveToFirstVarColumn(OS);
    for (unsigned ColIdx = FirstUnprintedColumn, End = ActiveCols.size();
         ColIdx < End; ++ColIdx) {
      if (!ActiveCols[ColIdx].isActive())
        OS << " ";
      else if (ActiveCols[ColIdx].LiveIn && ActiveCols[ColIdx].LiveOut)
        OS << getLineChar(LineChar::RangeMid) << " ";
      else if (ActiveCols[ColIdx].LiveOut)
        OS << getLineChar(LineChar::RangeStart) << " ";
      else if (ActiveCols[ColIdx].LiveIn)
        OS << getLineChar(LineChar::RangeEnd) << " ";
      else
        llvm_unreachable("var must be live in or out!");
    }
  }
};

class SourcePrinter {
protected:
  DILineInfo OldLineInfo;
  const ObjectFile *Obj = nullptr;
  std::unique_ptr<symbolize::LLVMSymbolizer> Symbolizer;
  // File name to file contents of source.
  std::unordered_map<std::string, std::unique_ptr<MemoryBuffer>> SourceCache;
  // Mark the line endings of the cached source.
  std::unordered_map<std::string, std::vector<StringRef>> LineCache;
  // Keep track of missing sources.
  StringSet<> MissingSources;
  // Only emit 'invalid debug info' warning once.
  bool WarnedInvalidDebugInfo = false;

private:
  bool cacheSource(const DILineInfo& LineInfoFile);

  void printLines(formatted_raw_ostream &OS, const DILineInfo &LineInfo,
                  StringRef Delimiter, LiveVariablePrinter &LVP);

  void printSources(formatted_raw_ostream &OS, const DILineInfo &LineInfo,
                    StringRef ObjectFilename, StringRef Delimiter,
                    LiveVariablePrinter &LVP);

public:
  SourcePrinter() = default;
  SourcePrinter(const ObjectFile *Obj, StringRef DefaultArch) : Obj(Obj) {
    symbolize::LLVMSymbolizer::Options SymbolizerOpts;
    SymbolizerOpts.PrintFunctions =
        DILineInfoSpecifier::FunctionNameKind::LinkageName;
    SymbolizerOpts.Demangle = Demangle;
    SymbolizerOpts.DefaultArch = std::string(DefaultArch);
    Symbolizer.reset(new symbolize::LLVMSymbolizer(SymbolizerOpts));
  }
  virtual ~SourcePrinter() = default;
  virtual void printSourceLine(formatted_raw_ostream &OS,
                               object::SectionedAddress Address,
                               StringRef ObjectFilename,
                               LiveVariablePrinter &LVP,
                               StringRef Delimiter = "; ");
};

bool SourcePrinter::cacheSource(const DILineInfo &LineInfo) {
  std::unique_ptr<MemoryBuffer> Buffer;
  if (LineInfo.Source) {
    Buffer = MemoryBuffer::getMemBuffer(*LineInfo.Source);
  } else {
    auto BufferOrError = MemoryBuffer::getFile(LineInfo.FileName);
    if (!BufferOrError) {
      if (MissingSources.insert(LineInfo.FileName).second)
        reportWarning("failed to find source " + LineInfo.FileName,
                      Obj->getFileName());
      return false;
    }
    Buffer = std::move(*BufferOrError);
  }
  // Chomp the file to get lines
  const char *BufferStart = Buffer->getBufferStart(),
             *BufferEnd = Buffer->getBufferEnd();
  std::vector<StringRef> &Lines = LineCache[LineInfo.FileName];
  const char *Start = BufferStart;
  for (const char *I = BufferStart; I != BufferEnd; ++I)
    if (*I == '\n') {
      Lines.emplace_back(Start, I - Start - (BufferStart < I && I[-1] == '\r'));
      Start = I + 1;
    }
  if (Start < BufferEnd)
    Lines.emplace_back(Start, BufferEnd - Start);
  SourceCache[LineInfo.FileName] = std::move(Buffer);
  return true;
}
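
// A minimal, standalone sketch of the line-splitting step in cacheSource(),
// using only the standard library. splitSourceLines is a hypothetical name,
// not an llvm-objdump function, and it copies each line into a std::string
// where the real code stores StringRef views into the cached buffer. It
// shows the same rule: split on '\n', drop a '\r' that directly precedes it,
// and keep a final line that has no terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split Buffer into lines without their terminators.
std::vector<std::string> splitSourceLines(const std::string &Buffer) {
  std::vector<std::string> Lines;
  std::size_t Start = 0;
  for (std::size_t I = 0; I != Buffer.size(); ++I) {
    if (Buffer[I] != '\n')
      continue;
    std::size_t Len = I - Start;
    // Treat "\r\n" as a single terminator.
    if (Len > 0 && Buffer[I - 1] == '\r')
      --Len;
    Lines.push_back(Buffer.substr(Start, Len));
    Start = I + 1;
  }
  assert(Start <= Buffer.size());
  // A trailing line without '\n' still counts as a line.
  if (Start < Buffer.size())
    Lines.push_back(Buffer.substr(Start));
  return Lines;
}
```

// Because a trailing '\n' does not produce an extra empty line, the number
// of entries matches the 1-based line numbers found in debug info.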

void SourcePrinter::printSourceLine(formatted_raw_ostream &OS,
                                    object::SectionedAddress Address,
                                    StringRef ObjectFilename,
                                    LiveVariablePrinter &LVP,
                                    StringRef Delimiter) {
  if (!Symbolizer)
    return;

  DILineInfo LineInfo = DILineInfo();
  Expected<DILineInfo> ExpectedLineInfo =
      Symbolizer->symbolizeCode(*Obj, Address);
  std::string ErrorMessage;
  if (ExpectedLineInfo) {
    LineInfo = *ExpectedLineInfo;
  } else if (!WarnedInvalidDebugInfo) {
    WarnedInvalidDebugInfo = true;
    // TODO Untested.
    reportWarning("failed to parse debug information: " +
                      toString(ExpectedLineInfo.takeError()),
                  ObjectFilename);
  }

  if (!Prefix.empty() && sys::path::is_absolute_gnu(LineInfo.FileName)) {
    SmallString<128> FilePath;
    sys::path::append(FilePath, Prefix, LineInfo.FileName);

    LineInfo.FileName = std::string(FilePath);
  }

  if (PrintLines)
    printLines(OS, LineInfo, Delimiter, LVP);
  if (PrintSource)
    printSources(OS, LineInfo, ObjectFilename, Delimiter, LVP);
  OldLineInfo = LineInfo;
}

void SourcePrinter::printLines(formatted_raw_ostream &OS,
                               const DILineInfo &LineInfo, StringRef Delimiter,
                               LiveVariablePrinter &LVP) {
  bool PrintFunctionName = LineInfo.FunctionName != DILineInfo::BadString &&
                           LineInfo.FunctionName != OldLineInfo.FunctionName;
  if (PrintFunctionName) {
    OS << Delimiter << LineInfo.FunctionName;
    // If demangling is successful, FunctionName will end with "()". Print it
    // only if demangling did not run or was unsuccessful.
    if (!StringRef(LineInfo.FunctionName).endswith("()"))
      OS << "()";
    OS << ":\n";
  }
  if (LineInfo.FileName != DILineInfo::BadString && LineInfo.Line != 0 &&
      (OldLineInfo.Line != LineInfo.Line ||
       OldLineInfo.FileName != LineInfo.FileName || PrintFunctionName)) {
    OS << Delimiter << LineInfo.FileName << ":" << LineInfo.Line;
    LVP.printBetweenInsts(OS, true);
  }
}

void SourcePrinter::printSources(formatted_raw_ostream &OS,
                                 const DILineInfo &LineInfo,
                                 StringRef ObjectFilename, StringRef Delimiter,
                                 LiveVariablePrinter &LVP) {
  if (LineInfo.FileName == DILineInfo::BadString || LineInfo.Line == 0 ||
      (OldLineInfo.Line == LineInfo.Line &&
       OldLineInfo.FileName == LineInfo.FileName))
    return;

  if (SourceCache.find(LineInfo.FileName) == SourceCache.end())
    if (!cacheSource(LineInfo))
      return;
  auto LineBuffer = LineCache.find(LineInfo.FileName);
  if (LineBuffer != LineCache.end()) {
    if (LineInfo.Line > LineBuffer->second.size()) {
      reportWarning(
          formatv(
              "debug info line number {0} exceeds the number of lines in {1}",
              LineInfo.Line, LineInfo.FileName),
          ObjectFilename);
      return;
    }
    // Vector begins at 0, line numbers are non-zero
    OS << Delimiter << LineBuffer->second[LineInfo.Line - 1];
    LVP.printBetweenInsts(OS, true);
  }
}
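
// A small sketch of the bounds handling in printSources(): debug-info line
// numbers are 1-based and 0 means "no line", while the cached line vector is
// 0-based. sourceLineAt is a hypothetical helper written for illustration
// only; the real function warns and returns instead of yielding a null
// pointer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: map a 1-based line number to a cached line, or
// nullptr when the number is 0 or past the end of the file.
const std::string *sourceLineAt(const std::vector<std::string> &Lines,
                                unsigned Line) {
  if (Line == 0 || Line > Lines.size())
    return nullptr;
  return &Lines[Line - 1];
}
```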

static bool isAArch64Elf(const ObjectFile *Obj) {
  const auto *Elf = dyn_cast<ELFObjectFileBase>(Obj);
  return Elf && Elf->getEMachine() == ELF::EM_AARCH64;
}

static bool isArmElf(const ObjectFile *Obj) {
  const auto *Elf = dyn_cast<ELFObjectFileBase>(Obj);
  return Elf && Elf->getEMachine() == ELF::EM_ARM;
}

static bool hasMappingSymbols(const ObjectFile *Obj) {
  return isArmElf(Obj) || isAArch64Elf(Obj);
}

static void printRelocation(formatted_raw_ostream &OS, StringRef FileName,
                            const RelocationRef &Rel, uint64_t Address,
                            bool Is64Bits) {
  StringRef Fmt = Is64Bits ? "\t\t%016" PRIx64 ": " : "\t\t\t%08" PRIx64 ": ";
  SmallString<16> Name;
  SmallString<32> Val;
  Rel.getTypeName(Name);
  if (Error E = getRelocationValueString(Rel, Val))
    reportError(std::move(E), FileName);
  OS << format(Fmt.data(), Address) << Name << "\t" << Val;
}

class PrettyPrinter {
public:
  virtual ~PrettyPrinter() = default;
  virtual void
  printInst(MCInstPrinter &IP, const MCInst *MI, ArrayRef<uint8_t> Bytes,
            object::SectionedAddress Address, formatted_raw_ostream &OS,
            StringRef Annot, MCSubtargetInfo const &STI, SourcePrinter *SP,
            StringRef ObjectFilename, std::vector<RelocationRef> *Rels,
            LiveVariablePrinter &LVP) {
    if (SP && (PrintSource || PrintLines))
      SP->printSourceLine(OS, Address, ObjectFilename, LVP);
    LVP.printBetweenInsts(OS, false);

    size_t Start = OS.tell();
    if (!NoLeadingAddr)
      OS << format("%8" PRIx64 ":", Address.Address);
    if (!NoShowRawInsn) {
      OS << ' ';
      dumpBytes(Bytes, OS);
    }

    // The output of printInst starts with a tab. Print some spaces so that
    // the tab has 1 column and advances to the target tab stop.
    unsigned TabStop = getInstStartColumn(STI);
    unsigned Column = OS.tell() - Start;
    OS.indent(Column < TabStop - 1 ? TabStop - 1 - Column : 7 - Column % 8);

    if (MI) {
      // See MCInstPrinter::printInst. On targets where a PC relative immediate
      // is relative to the next instruction and the length of a MCInst is
      // difficult to measure (x86), this is the address of the next
      // instruction.
      uint64_t Addr =
          Address.Address + (STI.getTargetTriple().isX86() ? Bytes.size() : 0);
      IP.printInst(MI, Addr, "", STI, OS);
    } else
      OS << "\t<unknown>";
  }
};
PrettyPrinter PrettyPrinterInst;
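
// The padding expression passed to OS.indent() in PrettyPrinter::printInst()
// is easier to follow with concrete numbers. The sketch below isolates it in
// a hypothetical free function, paddingBeforeInst (not an llvm-objdump
// name). It assumes tab stops every 8 columns and a TabStop of at least 1;
// the values used in the note after it are illustrative and are not taken
// from getInstStartColumn().

```cpp
#include <cassert>

// Hypothetical helper: number of spaces to print so that the tab that
// follows occupies exactly one column.
unsigned paddingBeforeInst(unsigned Column, unsigned TabStop) {
  assert(TabStop >= 1 && "TabStop - 1 must not wrap around");
  // Short line: pad to one column before the target tab stop.
  if (Column < TabStop - 1)
    return TabStop - 1 - Column;
  // Long line: pad to the next column that is 7 modulo 8, so the tab lands
  // on the following multiple of 8.
  return 7 - Column % 8;
}
```

// With TabStop = 40, a line that has reached column 30 gets 9 spaces (up to
// column 39) and the tab moves to column 40. A line at column 45 gets 2
// spaces (up to column 47) and the tab moves to column 48.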

class HexagonPrettyPrinter : public PrettyPrinter {
public:
  void printLead(ArrayRef<uint8_t> Bytes, uint64_t Address,
                 formatted_raw_ostream &OS) {
    uint32_t opcode =
        (Bytes[3] << 24) | (Bytes[2] << 16) | (Bytes[1] << 8) | Bytes[0];
    if (!NoLeadingAddr)
      OS << format("%8" PRIx64 ":", Address);
    if (!NoShowRawInsn) {
      OS << "\t";
      dumpBytes(Bytes.slice(0, 4), OS);
      OS << format("\t%08" PRIx32, opcode);
    }
  }
  void printInst(MCInstPrinter &IP, const MCInst *MI, ArrayRef<uint8_t> Bytes,
                 object::SectionedAddress Address, formatted_raw_ostream &OS,
                 StringRef Annot, MCSubtargetInfo const &STI, SourcePrinter *SP,
                 StringRef ObjectFilename, std::vector<RelocationRef> *Rels,
                 LiveVariablePrinter &LVP) override {
    if (SP && (PrintSource || PrintLines))
      SP->printSourceLine(OS, Address, ObjectFilename, LVP, "");
    if (!MI) {
      printLead(Bytes, Address.Address, OS);
      OS << " <unknown>";
      return;
    }
    std::string Buffer;
    {
      raw_string_ostream TempStream(Buffer);
      IP.printInst(MI, Address.Address, "", STI, TempStream);
    }
    StringRef Contents(Buffer);
    // Split off bundle attributes
    auto PacketBundle = Contents.rsplit('\n');
    // Split off first instruction from the rest
    auto HeadTail = PacketBundle.first.split('\n');
    auto Preamble = " { ";
    auto Separator = "";

    // Hexagon's packets require relocations to be inline rather than
    // clustered at the end of the packet.
    std::vector<RelocationRef>::const_iterator RelCur = Rels->begin();
    std::vector<RelocationRef>::const_iterator RelEnd = Rels->end();
    auto PrintReloc = [&]() -> void {
      while ((RelCur != RelEnd) && (RelCur->getOffset() <= Address.Address)) {
        if (RelCur->getOffset() == Address.Address) {
          printRelocation(OS, ObjectFilename, *RelCur, Address.Address, false);
          return;
        }
        ++RelCur;
      }
    };

    while (!HeadTail.first.empty()) {
      OS << Separator;
      Separator = "\n";
      if (SP && (PrintSource || PrintLines))
        SP->printSourceLine(OS, Address, ObjectFilename, LVP, "");
      printLead(Bytes, Address.Address, OS);
      OS << Preamble;
      Preamble = " ";
      StringRef Inst;
      auto Duplex = HeadTail.first.split('\v');
      if (!Duplex.second.empty()) {
        OS << Duplex.first;
        OS << "; ";
        Inst = Duplex.second;
      } else
        Inst = HeadTail.first;
      OS << Inst;
      HeadTail = HeadTail.second.split('\n');
      if (HeadTail.first.empty())
        OS << " } " << PacketBundle.second;
      PrintReloc();
      Bytes = Bytes.slice(4);
      Address.Address += 4;
    }
  }
};
HexagonPrettyPrinter HexagonPrettyPrinterInst;
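
// HexagonPrettyPrinter::printLead() assembles a 32-bit word from four bytes
// stored least-significant byte first. The standalone sketch below shows
// that byte order on plain standard-library types. readLE32 is a
// hypothetical name, and the explicit widening to uint32_t before shifting
// is a conservative choice made for this sketch, not something the original
// code does.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: read the first four bytes as a little-endian word.
uint32_t readLE32(const std::vector<uint8_t> &Bytes) {
  assert(Bytes.size() >= 4 && "need at least one full word");
  return (static_cast<uint32_t>(Bytes[3]) << 24) |
         (static_cast<uint32_t>(Bytes[2]) << 16) |
         (static_cast<uint32_t>(Bytes[1]) << 8) |
         static_cast<uint32_t>(Bytes[0]);
}
```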

class AMDGCNPrettyPrinter : public PrettyPrinter {
public:
  void printInst(MCInstPrinter &IP, const MCInst *MI, ArrayRef<uint8_t> Bytes,
                 object::SectionedAddress Address, formatted_raw_ostream &OS,
                 StringRef Annot, MCSubtargetInfo const &STI, SourcePrinter *SP,
                 StringRef ObjectFilename, std::vector<RelocationRef> *Rels,
                 LiveVariablePrinter &LVP) override {
    if (SP && (PrintSource || PrintLines))
      SP->printSourceLine(OS, Address, ObjectFilename, LVP);

    if (MI) {
      SmallString<40> InstStr;
      raw_svector_ostream IS(InstStr);

      IP.printInst(MI, Address.Address, "", STI, IS);

      OS << left_justify(IS.str(), 60);
    } else {
      // An unrecognized encoding - this is probably data, so represent it
      // using the .long directive, or the .byte directive if fewer than 4
      // bytes remain.
      if (Bytes.size() >= 4) {
        OS << format("\t.long 0x%08" PRIx32 " ",
                     support::endian::read32<support::little>(Bytes.data()));
        OS.indent(42);
      } else {
        OS << format("\t.byte 0x%02" PRIx8, Bytes[0]);
        for (unsigned int i = 1; i < Bytes.size(); i++)
          OS << format(", 0x%02" PRIx8, Bytes[i]);
        OS.indent(55 - (6 * Bytes.size()));
      }
    }

    OS << format("// %012" PRIX64 ":", Address.Address);
    if (Bytes.size() >= 4) {
      // D should be cast to uint32_t here as it is passed by format to
      // snprintf as a vararg.
      for (uint32_t D : makeArrayRef(
               reinterpret_cast<const support::little32_t *>(Bytes.data()),
               Bytes.size() / 4))
        OS << format(" %08" PRIX32, D);
    } else {
      for (unsigned char B : Bytes)
        OS << format(" %02" PRIX8, B);
    }

    if (!Annot.empty())
      OS << " // " << Annot;
  }
};
AMDGCNPrettyPrinter AMDGCNPrettyPrinterInst;

class BPFPrettyPrinter : public PrettyPrinter {
public:
  void printInst(MCInstPrinter &IP, const MCInst *MI, ArrayRef<uint8_t> Bytes,
                 object::SectionedAddress Address, formatted_raw_ostream &OS,
                 StringRef Annot, MCSubtargetInfo const &STI, SourcePrinter *SP,
                 StringRef ObjectFilename, std::vector<RelocationRef> *Rels,
                 LiveVariablePrinter &LVP) override {
    if (SP && (PrintSource || PrintLines))
      SP->printSourceLine(OS, Address, ObjectFilename, LVP);
    if (!NoLeadingAddr)
      OS << format("%8" PRId64 ":", Address.Address / 8);
    if (!NoShowRawInsn) {
      OS << "\t";
      dumpBytes(Bytes, OS);
    }
    if (MI)
      IP.printInst(MI, Address.Address, "", STI, OS);
    else
      OS << "\t<unknown>";
  }
};
BPFPrettyPrinter BPFPrettyPrinterInst;

PrettyPrinter &selectPrettyPrinter(Triple const &Triple) {
  switch (Triple.getArch()) {
  default:
    return PrettyPrinterInst;
  case Triple::hexagon:
    return HexagonPrettyPrinterInst;
  case Triple::amdgcn:
    return AMDGCNPrettyPrinterInst;
  case Triple::bpfel:
  case Triple::bpfeb:
    return BPFPrettyPrinterInst;
  }
}
}

static uint8_t getElfSymbolType(const ObjectFile *Obj, const SymbolRef &Sym) {
  assert(Obj->isELF());
  if (auto *Elf32LEObj = dyn_cast<ELF32LEObjectFile>(Obj))
    return unwrapOrError(Elf32LEObj->getSymbol(Sym.getRawDataRefImpl()),
                         Obj->getFileName())
        ->getType();
  if (auto *Elf64LEObj = dyn_cast<ELF64LEObjectFile>(Obj))
    return unwrapOrError(Elf64LEObj->getSymbol(Sym.getRawDataRefImpl()),
                         Obj->getFileName())
        ->getType();
  if (auto *Elf32BEObj = dyn_cast<ELF32BEObjectFile>(Obj))
    return unwrapOrError(Elf32BEObj->getSymbol(Sym.getRawDataRefImpl()),
                         Obj->getFileName())
        ->getType();
  if (auto *Elf64BEObj = cast<ELF64BEObjectFile>(Obj))
    return unwrapOrError(Elf64BEObj->getSymbol(Sym.getRawDataRefImpl()),
                         Obj->getFileName())
        ->getType();
  llvm_unreachable("Unsupported binary format");
}

template <class ELFT> static void
addDynamicElfSymbols(const ELFObjectFile<ELFT> *Obj,
                     std::map<SectionRef, SectionSymbolsTy> &AllSymbols) {
  for (auto Symbol : Obj->getDynamicSymbolIterators()) {
    uint8_t SymbolType = Symbol.getELFType();
    if (SymbolType == ELF::STT_SECTION)
      continue;

    uint64_t Address = unwrapOrError(Symbol.getAddress(), Obj->getFileName());
    // ELFSymbolRef::getAddress() returns size instead of value for common
    // symbols which is not desirable for disassembly output. Overriding.
    if (SymbolType == ELF::STT_COMMON)
      Address = unwrapOrError(Obj->getSymbol(Symbol.getRawDataRefImpl()),
                              Obj->getFileName())
                    ->st_value;

    StringRef Name = unwrapOrError(Symbol.getName(), Obj->getFileName());
    if (Name.empty())
      continue;

    section_iterator SecI =
        unwrapOrError(Symbol.getSection(), Obj->getFileName());
    if (SecI == Obj->section_end())
      continue;

    AllSymbols[*SecI].emplace_back(Address, Name, SymbolType);
  }
}

static void
addDynamicElfSymbols(const ObjectFile *Obj,
                     std::map<SectionRef, SectionSymbolsTy> &AllSymbols) {
  assert(Obj->isELF());
  if (auto *Elf32LEObj = dyn_cast<ELF32LEObjectFile>(Obj))
    addDynamicElfSymbols(Elf32LEObj, AllSymbols);
  else if (auto *Elf64LEObj = dyn_cast<ELF64LEObjectFile>(Obj))
    addDynamicElfSymbols(Elf64LEObj, AllSymbols);
  else if (auto *Elf32BEObj = dyn_cast<ELF32BEObjectFile>(Obj))
    addDynamicElfSymbols(Elf32BEObj, AllSymbols);
  else if (auto *Elf64BEObj = cast<ELF64BEObjectFile>(Obj))
    addDynamicElfSymbols(Elf64BEObj, AllSymbols);
  else
    llvm_unreachable("Unsupported binary format");
}

static void addPltEntries(const ObjectFile *Obj,
                          std::map<SectionRef, SectionSymbolsTy> &AllSymbols,
                          StringSaver &Saver) {
  Optional<SectionRef> Plt = None;
  for (const SectionRef &Section : Obj->sections()) {
    Expected<StringRef> SecNameOrErr = Section.getName();
    if (!SecNameOrErr) {
      consumeError(SecNameOrErr.takeError());
      continue;
    }
    if (*SecNameOrErr == ".plt")
      Plt = Section;
  }
  if (!Plt)
    return;
  if (auto *ElfObj = dyn_cast<ELFObjectFileBase>(Obj)) {
    for (auto PltEntry : ElfObj->getPltAddresses()) {
      if (PltEntry.first) {
        SymbolRef Symbol(*PltEntry.first, ElfObj);
        uint8_t SymbolType = getElfSymbolType(Obj, Symbol);
        if (Expected<StringRef> NameOrErr = Symbol.getName()) {
          if (!NameOrErr->empty())
            AllSymbols[*Plt].emplace_back(
                PltEntry.second, Saver.save((*NameOrErr + "@plt").str()),
                SymbolType);
          continue;
        } else {
          // The warning has been reported in disassembleObject().
          consumeError(NameOrErr.takeError());
        }
      }
      reportWarning("PLT entry at 0x" + Twine::utohexstr(PltEntry.second) +
                        " references an invalid symbol",
                    Obj->getFileName());
    }
  }
}

// Normally the disassembly output will skip blocks of zeroes. This function
// returns the number of zero bytes that can be skipped when dumping the
// disassembly of the instructions in Buf.
static size_t countSkippableZeroBytes(ArrayRef<uint8_t> Buf) {
  // Find the number of leading zeroes.
  size_t N = 0;
  while (N < Buf.size() && !Buf[N])
    ++N;

  // We may want to skip blocks of zero bytes, but only if we see
  // at least 8 of them in a row.
  if (N < 8)
    return 0;

  // We skip zeroes in multiples of 4 because we do not want to truncate an
  // instruction if it starts with a zero byte.
  return N & ~0x3;
}
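
// A standalone sketch of the rule implemented by countSkippableZeroBytes(),
// using std::vector in place of ArrayRef so it builds without LLVM headers.
// skippableZeroPrefix is a hypothetical name chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: number of leading zero bytes that may be skipped.
std::size_t skippableZeroPrefix(const std::vector<uint8_t> &Buf) {
  // Count the leading zero bytes.
  std::size_t N = 0;
  while (N < Buf.size() && Buf[N] == 0)
    ++N;
  assert(N <= Buf.size());
  // Runs shorter than 8 bytes are not worth skipping.
  if (N < 8)
    return 0;
  // Round down to a multiple of 4 so that an instruction whose encoding
  // starts with a zero byte is not cut in half.
  return N & ~static_cast<std::size_t>(3);
}
```

// For example, 11 zero bytes followed by a non-zero byte give 8: the run is
// long enough to skip, but only whole 4-byte groups are dropped.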
// Returns a map from sections to their relocations.
|
|
|
|
static std::map<SectionRef, std::vector<RelocationRef>>
|
2019-04-15 23:31:42 +08:00
|
|
|
getRelocsMap(object::ObjectFile const &Obj) {
|
2019-01-22 22:09:37 +08:00
|
|
|
std::map<SectionRef, std::vector<RelocationRef>> Ret;
|
2019-10-21 19:06:38 +08:00
|
|
|
uint64_t I = (uint64_t)-1;
|
2019-05-22 23:12:51 +08:00
|
|
|
for (SectionRef Sec : Obj.sections()) {
|
2019-10-21 19:06:38 +08:00
|
|
|
++I;
|
|
|
|
Expected<section_iterator> RelocatedOrErr = Sec.getRelocatedSection();
|
|
|
|
if (!RelocatedOrErr)
|
|
|
|
reportError(Obj.getFileName(),
|
|
|
|
"section (" + Twine(I) +
|
|
|
|
"): failed to get a relocated section: " +
|
|
|
|
toString(RelocatedOrErr.takeError()));
|
|
|
|
|
|
|
|
section_iterator Relocated = *RelocatedOrErr;
|
Reland [llvm-objdump] Use a counter for llvm-objdump -h instead of the section index.
This relands r374931 (reverted in r375088). It fixes 32-bit builds by using the right format string specifier for uint64_t (PRIu64) instead of `%d`.
Original description:
When listing the index in `llvm-objdump -h`, use a zero-based counter instead of the actual section index (e.g. shdr->sh_index for ELF).
While this is effectively a noop for now (except one unit test for XCOFF), the index values will change in a future patch that filters certain sections out (e.g. symbol tables). See D68669 for more context. Note: the test case in `test/tools/llvm-objdump/X86/section-index.s` already covers the case of incrementing the section index counter when sections are skipped.
Reviewers: grimar, jhenderson, espindola
Reviewed By: grimar
Subscribers: emaste, sbc100, arichardson, aheejin, arphaman, seiya, llvm-commits, MaskRay
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68848
llvm-svn: 375178
2019-10-18 05:55:43 +08:00
|
|
|
if (Relocated == Obj.section_end() || !checkSectionFilter(*Relocated).Keep)
|
2019-01-22 22:09:37 +08:00
|
|
|
continue;
|
2019-05-22 23:12:51 +08:00
|
|
|
std::vector<RelocationRef> &V = Ret[*Relocated];
|
2021-01-27 12:00:19 +08:00
|
|
|
append_range(V, Sec.relocations());
|
2019-01-22 22:09:37 +08:00
|
|
|
// Sort relocations by address.
|
2019-05-22 23:12:51 +08:00
|
|
|
llvm::stable_sort(V, isRelocAddressLess);
|
2019-01-22 22:09:37 +08:00
|
|
|
}
|
|
|
|
return Ret;
|
|
|
|
}
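The grouping done by `getRelocsMap` can be sketched without any LLVM types. The following is a minimal illustration, not the tool's implementation: `Reloc`, `RelocSection`, and `groupRelocs` are hypothetical stand-ins, and an integer index plays the role of `SectionRef`. It shows relocations from several relocation sections being appended to the bucket of the section they patch, with a stable sort by offset after each append.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for object::RelocationRef: only what the sketch needs.
struct Reloc {
  uint64_t Offset;  // address the relocation applies to
  std::string Name; // label used to observe the resulting order
};

// Hypothetical stand-in for a relocation section: the index of the section it
// patches (negative when there is none) and the relocations it holds.
struct RelocSection {
  int Target;
  std::vector<Reloc> Relocs;
};

// Group relocations by the section they apply to and keep every group sorted
// by offset. The stable sort preserves input order for equal offsets.
std::map<int, std::vector<Reloc>>
groupRelocs(const std::vector<RelocSection> &Sections) {
  std::map<int, std::vector<Reloc>> Ret;
  for (const RelocSection &Sec : Sections) {
    if (Sec.Target < 0)
      continue; // nothing is relocated by this section
    std::vector<Reloc> &V = Ret[Sec.Target];
    V.insert(V.end(), Sec.Relocs.begin(), Sec.Relocs.end());
    std::stable_sort(V.begin(), V.end(), [](const Reloc &A, const Reloc &B) {
      return A.Offset < B.Offset;
    });
  }
  return Ret;
}
```

Because the sort is stable, two relocations at the same offset keep the order in which their sections were visited, which keeps the printed output deterministic.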
|
|
|
|
|
2019-01-28 18:44:01 +08:00
|
|
|
// Used for --adjust-vma to check if address should be adjusted by the
|
|
|
|
// specified value for a given section.
|
|
|
|
// For ELF we do not adjust non-allocatable sections like debug ones,
|
|
|
|
// because they are not loadable.
|
|
|
|
// TODO: implement for other file formats.
|
|
|
|
static bool shouldAdjustVA(const SectionRef &Section) {
|
|
|
|
const ObjectFile *Obj = Section.getObject();
|
2020-04-05 12:31:22 +08:00
|
|
|
if (Obj->isELF())
|
2019-01-28 18:44:01 +08:00
|
|
|
return ELFSectionRef(Section).getFlags() & ELF::SHF_ALLOC;
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
2019-06-20 08:29:40 +08:00
|
|
|
|
|
|
|
typedef std::pair<uint64_t, char> MappingSymbolPair;
|
|
|
|
static char getMappingSymbolKind(ArrayRef<MappingSymbolPair> MappingSymbols,
|
|
|
|
uint64_t Address) {
|
2019-06-30 19:19:56 +08:00
|
|
|
auto It =
|
|
|
|
partition_point(MappingSymbols, [Address](const MappingSymbolPair &Val) {
|
|
|
|
return Val.first <= Address;
|
|
|
|
});
|
2019-06-20 08:29:40 +08:00
|
|
|
// Return zero for any address before the first mapping symbol; this means
|
|
|
|
// we should use the default disassembly mode, depending on the target.
|
2019-06-30 19:19:56 +08:00
|
|
|
if (It == MappingSymbols.begin())
|
2019-06-20 08:29:40 +08:00
|
|
|
return '\x00';
|
2019-06-30 19:19:56 +08:00
|
|
|
return (It - 1)->second;
|
2019-06-20 08:29:40 +08:00
|
|
|
}
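The lookup in `getMappingSymbolKind` is a "last entry at or below the address" search. A minimal sketch using only the standard library is shown here, assuming the input is already sorted by address; `MappingSymbol` and `mappingKindAt` are illustrative names, and `std::partition_point` stands in for the LLVM range wrapper.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <utility>
#include <vector>

using MappingSymbol = std::pair<uint64_t, char>;

// Return the kind of the last mapping symbol whose address is <= Address, or
// '\0' when Address lies before the first mapping symbol.
// Assumption: Sorted is ordered by ascending address.
char mappingKindAt(const std::vector<MappingSymbol> &Sorted, uint64_t Address) {
  // First element whose address is greater than Address.
  auto It = std::partition_point(
      Sorted.begin(), Sorted.end(),
      [Address](const MappingSymbol &S) { return S.first <= Address; });
  if (It == Sorted.begin())
    return '\0';
  return std::prev(It)->second;
}
```

For a table such as `{0x10,'a'}, {0x20,'d'}, {0x30,'t'}`, any address in the range `0x20` to `0x2f` maps to `'d'`, and an address below `0x10` yields the zero byte that selects the default disassembly mode.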
|
|
|
|
|
2020-05-02 13:52:42 +08:00
|
|
|
static uint64_t dumpARMELFData(uint64_t SectionAddr, uint64_t Index,
|
|
|
|
uint64_t End, const ObjectFile *Obj,
|
|
|
|
ArrayRef<uint8_t> Bytes,
|
2020-03-17 22:21:42 +08:00
|
|
|
ArrayRef<MappingSymbolPair> MappingSymbols,
|
|
|
|
raw_ostream &OS) {
|
2019-04-08 00:33:24 +08:00
|
|
|
support::endianness Endian =
|
|
|
|
Obj->isLittleEndian() ? support::little : support::big;
|
2020-03-17 22:21:42 +08:00
|
|
|
OS << format("%8" PRIx64 ":\t", SectionAddr + Index);
|
2020-05-02 13:52:42 +08:00
|
|
|
if (Index + 4 <= End) {
|
2020-03-17 22:21:42 +08:00
|
|
|
dumpBytes(Bytes.slice(Index, 4), OS);
|
|
|
|
OS << "\t.word\t"
|
2020-05-02 13:52:42 +08:00
|
|
|
<< format_hex(support::endian::read32(Bytes.data() + Index, Endian),
|
|
|
|
10);
|
|
|
|
return 4;
|
2019-04-08 00:33:24 +08:00
|
|
|
}
|
2020-05-02 13:52:42 +08:00
|
|
|
if (Index + 2 <= End) {
|
2020-03-17 22:21:42 +08:00
|
|
|
dumpBytes(Bytes.slice(Index, 2), OS);
|
|
|
|
OS << "\t\t.short\t"
|
2020-05-02 13:52:42 +08:00
|
|
|
<< format_hex(support::endian::read16(Bytes.data() + Index, Endian),
|
|
|
|
6);
|
|
|
|
return 2;
|
|
|
|
}
|
2020-03-17 22:21:42 +08:00
|
|
|
dumpBytes(Bytes.slice(Index, 1), OS);
|
|
|
|
OS << "\t\t.byte\t" << format_hex(Bytes[Index], 4);
|
2020-05-02 13:52:42 +08:00
|
|
|
return 1;
|
2019-04-08 00:33:24 +08:00
|
|
|
}
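The size selection in `dumpARMELFData` (a word if four bytes remain, otherwise a halfword, otherwise a single byte) can be illustrated in isolation. This is a rough sketch under stated assumptions: `dataDirective` is a hypothetical helper, it returns the text instead of writing to a stream, and the exact spacing of the output does not try to match llvm-objdump.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of llvm-objdump): returns how many bytes the
// directive covers and the directive text for the data starting at Index.
// Precondition: Index < End and End <= Bytes.size().
std::pair<unsigned, std::string>
dataDirective(const std::vector<uint8_t> &Bytes, std::size_t Index,
              std::size_t End, bool LittleEndian) {
  // Assemble N bytes starting at Index in the requested byte order.
  auto Read = [&](unsigned N) {
    unsigned Value = 0;
    for (unsigned I = 0; I < N; ++I) {
      unsigned Shift = 8 * (LittleEndian ? I : N - 1 - I);
      Value |= static_cast<unsigned>(Bytes[Index + I]) << Shift;
    }
    return Value;
  };

  char Buf[32];
  if (Index + 4 <= End) {
    std::snprintf(Buf, sizeof(Buf), ".word 0x%08x", Read(4));
    return std::make_pair(4u, std::string(Buf));
  }
  if (Index + 2 <= End) {
    std::snprintf(Buf, sizeof(Buf), ".short 0x%04x", Read(2));
    return std::make_pair(2u, std::string(Buf));
  }
  std::snprintf(Buf, sizeof(Buf), ".byte 0x%02x", Read(1));
  return std::make_pair(1u, std::string(Buf));
}
```

The caller advances its index by the returned size, so a seven-byte data region is emitted as one word, one halfword, and one byte.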
|
|
|
|
|
|
|
|
static void dumpELFData(uint64_t SectionAddr, uint64_t Index, uint64_t End,
|
|
|
|
ArrayRef<uint8_t> Bytes) {
|
|
|
|
// Print out data up to 8 bytes at a time in hex and ASCII.
|
|
|
|
uint8_t AsciiData[9] = {'\0'};
|
|
|
|
uint8_t Byte;
|
|
|
|
int NumBytes = 0;
|
|
|
|
|
|
|
|
for (; Index < End; ++Index) {
|
2019-06-20 02:44:29 +08:00
|
|
|
if (NumBytes == 0)
|
2019-04-08 00:33:24 +08:00
|
|
|
outs() << format("%8" PRIx64 ":", SectionAddr + Index);
|
|
|
|
Byte = Bytes.slice(Index)[0];
|
|
|
|
outs() << format(" %02x", Byte);
|
|
|
|
AsciiData[NumBytes] = isPrint(Byte) ? Byte : '.';
|
|
|
|
|
|
|
|
uint8_t IndentOffset = 0;
|
|
|
|
NumBytes++;
|
|
|
|
if (Index == End - 1 || NumBytes > 8) {
|
|
|
|
// Pad the hex column when fewer than 8 bytes remain: each missing byte
// takes two hex digits plus one separating space.
|
|
|
|
IndentOffset = 3 * (8 - NumBytes);
|
|
|
|
for (int Excess = NumBytes; Excess < 8; Excess++)
|
|
|
|
AsciiData[Excess] = '\0';
|
|
|
|
NumBytes = 8;
|
|
|
|
}
|
|
|
|
if (NumBytes == 8) {
|
|
|
|
AsciiData[8] = '\0';
|
|
|
|
outs() << std::string(IndentOffset, ' ') << " ";
|
|
|
|
outs() << reinterpret_cast<char *>(AsciiData);
|
|
|
|
outs() << '\n';
|
|
|
|
NumBytes = 0;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
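The row layout produced by `dumpELFData` (address, up to eight hex bytes, padding for a short final row, then a printable-ASCII column) can be sketched as a function that builds the text. This is an illustration only: `hexAsciiDump` is a hypothetical name, it returns a string rather than writing to `outs()`, and the single space before the ASCII column is an assumption about the separator width.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Format Bytes as rows of up to 8 bytes: "<addr>: xx xx ... <ascii>\n".
// Non-printable bytes are shown as '.' in the ASCII column.
std::string hexAsciiDump(const std::vector<uint8_t> &Bytes, uint64_t BaseAddr) {
  std::string Out;
  char Buf[32];
  for (std::size_t Row = 0; Row < Bytes.size(); Row += 8) {
    std::size_t N = std::min<std::size_t>(8, Bytes.size() - Row);
    std::snprintf(Buf, sizeof(Buf), "%8llx:",
                  static_cast<unsigned long long>(BaseAddr + Row));
    Out += Buf;
    std::string Ascii;
    for (std::size_t I = 0; I < N; ++I) {
      uint8_t B = Bytes[Row + I];
      std::snprintf(Buf, sizeof(Buf), " %02x", static_cast<unsigned>(B));
      Out += Buf;
      Ascii += (B >= 0x20 && B < 0x7f) ? static_cast<char>(B) : '.';
    }
    // Three columns (two hex digits and a space) per missing byte keep the
    // ASCII column aligned on a short final row.
    Out.append(3 * (8 - N), ' ');
    Out += ' ';
    Out += Ascii;
    Out += '\n';
  }
  return Out;
}
```

Building each row from its own slice avoids carrying a byte counter across loop iterations, at the cost of not streaming the output byte by byte.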
|
|
|
|
|
2020-05-31 09:00:14 +08:00
|
|
|
SymbolInfoTy objdump::createSymbolInfo(const ObjectFile *Obj,
|
|
|
|
const SymbolRef &Symbol) {
|
2020-04-06 22:09:12 +08:00
|
|
|
const StringRef FileName = Obj->getFileName();
|
|
|
|
const uint64_t Addr = unwrapOrError(Symbol.getAddress(), FileName);
|
|
|
|
const StringRef Name = unwrapOrError(Symbol.getName(), FileName);
|
|
|
|
|
|
|
|
if (Obj->isXCOFF() && SymbolDescription) {
|
|
|
|
const auto *XCOFFObj = cast<XCOFFObjectFile>(Obj);
|
|
|
|
DataRefImpl SymbolDRI = Symbol.getRawDataRefImpl();
|
|
|
|
|
|
|
|
const uint32_t SymbolIndex = XCOFFObj->getSymbolIndex(SymbolDRI.p);
|
|
|
|
Optional<XCOFF::StorageMappingClass> Smc =
|
|
|
|
getXCOFFSymbolCsectSMC(XCOFFObj, Symbol);
|
|
|
|
return SymbolInfoTy(Addr, Name, Smc, SymbolIndex,
|
|
|
|
isLabel(XCOFFObj, Symbol));
|
|
|
|
} else
|
|
|
|
return SymbolInfoTy(Addr, Name,
|
|
|
|
Obj->isELF() ? getElfSymbolType(Obj, Symbol)
|
|
|
|
: (uint8_t)ELF::STT_NOTYPE);
|
|
|
|
}
|
|
|
|
|
2020-05-31 09:00:14 +08:00
|
|
|
static SymbolInfoTy createDummySymbolInfo(const ObjectFile *Obj,
|
|
|
|
const uint64_t Addr, StringRef &Name,
|
|
|
|
uint8_t Type) {
|
2020-04-06 22:09:12 +08:00
|
|
|
if (Obj->isXCOFF() && SymbolDescription)
|
|
|
|
return SymbolInfoTy(Addr, Name, None, None, false);
|
|
|
|
else
|
|
|
|
return SymbolInfoTy(Addr, Name, Type);
|
|
|
|
}
|
|
|
|
|
[llvm-objdump] Symbolize binary addresses for low-noisy asm diff.
When diffing the disassembly dumps of two binaries, I see a lot of noise from mismatched jump target addresses and global data references, which unnecessarily causes diffs on every function and makes the comparison impractical. I'm trying to symbolize the raw binary addresses to minimize the diff noise.
In this change, a local branch target is modeled as a label and the branch target operand will simply be printed as a label. Local labels are collected by a separate pre-decoding pass beforehand. A global data memory operand will be printed as a global symbol instead of the raw data address. Unfortunately, due to the way the disassembler is set up and to be less intrusive, a global symbol is always printed as the last operand of a memory access instruction. This is less than ideal but is probably acceptable from checking code quality point of view since on most targets an instruction can have at most one memory operand.
So far only the X86 disassemblers are supported.
Test Plan:
llvm-objdump -d --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr :
```
Disassembly of section .text:
<_start>:
push rax
mov dword ptr [rsp + 4], 0
mov dword ptr [rsp], 0
mov eax, dword ptr [rsp]
cmp eax, dword ptr [rip + 4112] # 202182 <g>
jge 0x20117e <_start+0x25>
call 0x201158 <foo>
inc dword ptr [rsp]
jmp 0x201169 <_start+0x10>
xor eax, eax
pop rcx
ret
```
llvm-objdump -d **--symbolize-operands** --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr :
```
Disassembly of section .text:
<_start>:
push rax
mov dword ptr [rsp + 4], 0
mov dword ptr [rsp], 0
<L1>:
mov eax, dword ptr [rsp]
cmp eax, dword ptr <g>
jge <L0>
call <foo>
inc dword ptr [rsp]
jmp <L1>
<L0>:
xor eax, eax
pop rcx
ret
```
Note that jump instructions like `jge 0x20117e <_start+0x25>` are, without this work, printed as a real target address and an offset from the leading symbol. With a change in the optimizer that adds/deletes an instruction, the address and offset may shift for targets placed after the instruction. This is a problem when diffing the disassembly from two optimizers, where such branch target address changes cause unnecessary false positives. With `--symbolize-operands`, a label is printed for a branch target instead to reduce the false positives. Similarly, the disassembly of PC-relative global variable references is also prone to instruction insertion/deletion.
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D84191
2020-07-21 00:45:32 +08:00
|
|
|
static void
|
|
|
|
collectLocalBranchTargets(ArrayRef<uint8_t> Bytes, const MCInstrAnalysis *MIA,
|
|
|
|
MCDisassembler *DisAsm, MCInstPrinter *IP,
|
|
|
|
const MCSubtargetInfo *STI, uint64_t SectionAddr,
|
|
|
|
uint64_t Start, uint64_t End,
|
|
|
|
std::unordered_map<uint64_t, std::string> &Labels) {
|
|
|
|
// So far only supports X86.
|
|
|
|
if (!STI->getTargetTriple().isX86())
|
|
|
|
return;
|
|
|
|
|
|
|
|
Labels.clear();
|
|
|
|
unsigned LabelCount = 0;
|
|
|
|
Start += SectionAddr;
|
|
|
|
End += SectionAddr;
|
|
|
|
uint64_t Index = Start;
|
|
|
|
while (Index < End) {
|
|
|
|
// Disassemble a real instruction and record function-local branch labels.
|
|
|
|
MCInst Inst;
|
|
|
|
uint64_t Size;
|
|
|
|
bool Disassembled = DisAsm->getInstruction(
|
|
|
|
Inst, Size, Bytes.slice(Index - SectionAddr), Index, nulls());
|
|
|
|
if (Size == 0)
|
|
|
|
Size = 1;
|
|
|
|
|
|
|
|
if (Disassembled && MIA) {
|
|
|
|
uint64_t Target;
|
|
|
|
bool TargetKnown = MIA->evaluateBranch(Inst, Index, Size, Target);
|
|
|
|
if (TargetKnown && (Target >= Start && Target < End) &&
|
|
|
|
!Labels.count(Target))
|
|
|
|
Labels[Target] = ("L" + Twine(LabelCount++)).str();
|
|
|
|
}
|
|
|
|
|
|
|
|
Index += Size;
|
|
|
|
}
|
|
|
|
}
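The pre-decoding pass in `collectLocalBranchTargets` boils down to: walk the instructions of one symbol, and give each distinct in-range branch target a label in order of first appearance. The sketch below assumes the instructions have already been decoded; `ToyInst` and `collectLabels` are hypothetical names that replace the `MCDisassembler` and `MCInstrAnalysis` calls used by the real code.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical pre-decoded instruction: its size in bytes and, for branches
// whose destination is known, the absolute target address.
struct ToyInst {
  uint64_t Size;
  bool HasTarget;
  uint64_t Target;
};

// Assign labels L0, L1, ... to branch targets inside [Start, End), where End
// is Start plus the total size of Insts. Targets outside the range (calls to
// other functions, for example) get no label.
std::unordered_map<uint64_t, std::string>
collectLabels(const std::vector<ToyInst> &Insts, uint64_t Start) {
  uint64_t End = Start;
  for (const ToyInst &I : Insts)
    End += I.Size;

  std::unordered_map<uint64_t, std::string> Labels;
  unsigned LabelCount = 0;
  for (const ToyInst &I : Insts) {
    if (I.HasTarget && I.Target >= Start && I.Target < End &&
        !Labels.count(I.Target))
      Labels[I.Target] = "L" + std::to_string(LabelCount++);
  }
  return Labels;
}
```

Numbering by first reference rather than by address explains why, in the commit's sample output, `<L1>` appears in the listing before `<L0>`: the forward `jge` is encountered before the backward `jmp`.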
|
|
|
|
|
2020-09-04 08:07:59 +08:00
|
|
|
static StringRef getSegmentName(const MachOObjectFile *MachO,
|
|
|
|
const SectionRef &Section) {
|
|
|
|
if (MachO) {
|
|
|
|
DataRefImpl DR = Section.getRawDataRefImpl();
|
|
|
|
StringRef SegmentName = MachO->getSectionFinalSegmentName(DR);
|
|
|
|
return SegmentName;
|
|
|
|
}
|
|
|
|
return "";
|
|
|
|
}
|
|
|
|
|
2019-01-23 18:33:26 +08:00
|
|
|
static void disassembleObject(const Target *TheTarget, const ObjectFile *Obj,
|
2019-06-20 08:29:40 +08:00
|
|
|
MCContext &Ctx, MCDisassembler *PrimaryDisAsm,
|
|
|
|
MCDisassembler *SecondaryDisAsm,
|
2019-01-23 18:33:26 +08:00
|
|
|
const MCInstrAnalysis *MIA, MCInstPrinter *IP,
|
2019-06-20 08:29:40 +08:00
|
|
|
const MCSubtargetInfo *PrimarySTI,
|
|
|
|
const MCSubtargetInfo *SecondarySTI,
|
|
|
|
PrettyPrinter &PIP,
|
2019-01-23 18:33:26 +08:00
|
|
|
SourcePrinter &SP, bool InlineRelocs) {
|
2019-06-20 08:29:40 +08:00
|
|
|
const MCSubtargetInfo *STI = PrimarySTI;
|
|
|
|
MCDisassembler *DisAsm = PrimaryDisAsm;
|
|
|
|
bool PrimaryIsThumb = false;
|
|
|
|
if (isArmElf(Obj))
|
|
|
|
PrimaryIsThumb = STI->checkFeatures("+thumb-mode");
|
|
|
|
|
2019-01-22 22:09:37 +08:00
|
|
|
std::map<SectionRef, std::vector<RelocationRef>> RelocMap;
|
|
|
|
if (InlineRelocs)
|
|
|
|
RelocMap = getRelocsMap(*Obj);
|
2019-06-17 17:59:55 +08:00
|
|
|
bool Is64Bits = Obj->getBytesInAddress() > 4;
|
2014-01-26 01:38:19 +08:00
|
|
|
|
2015-07-08 06:06:59 +08:00
|
|
|
// Create a mapping from virtual address to symbol name. This is used to
|
2015-11-18 10:49:19 +08:00
|
|
|
// pretty print the symbols while disassembling.
|
|
|
|
std::map<SectionRef, SectionSymbolsTy> AllSymbols;
|
2018-06-29 02:57:13 +08:00
|
|
|
SectionSymbolsTy AbsoluteSymbols;
|
2019-04-07 16:19:55 +08:00
|
|
|
const StringRef FileName = Obj->getFileName();
|
llvm-objdump should ignore Mach-O stab symbols for disassembly.
Summary:
llvm-objdump will commonly error out when disassembling a Mach-O binary with
stab symbols, or when printing a Mach-O symbol table that includes stab symbols.
That is because the Mach-O N_OSO symbol has been modified to include the
bottom 8-bit value of the Mach-O's cpusubtype value in the section field. In
general, one cannot blindly assume a stab symbol's section field is valid
unless one has actually consulted the specification for the specific stab.
Since objdump mostly just walks the symbol table to get mnemonics for code
disassembly it's best for objdump to just ignore stab symbols. llvm-nm will
do a more complete and correct job of displaying Mach-O symbol table contents.
Reviewers: pete, lhames, ab, thegameg, jhenderson, MaskRay
Reviewed By: thegameg, MaskRay
Subscribers: MaskRay, rupprecht, seiya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71394
2019-12-12 14:43:46 +08:00
|
|
|
const MachOObjectFile *MachO = dyn_cast<const MachOObjectFile>(Obj);
|
2015-11-18 10:49:19 +08:00
|
|
|
for (const SymbolRef &Symbol : Obj->symbols()) {
|
2020-08-13 23:13:26 +08:00
|
|
|
Expected<StringRef> NameOrErr = Symbol.getName();
|
|
|
|
if (!NameOrErr) {
|
|
|
|
reportWarning(toString(NameOrErr.takeError()), FileName);
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
if (NameOrErr->empty() && !(Obj->isXCOFF() && SymbolDescription))
|
2019-04-07 16:19:55 +08:00
|
|
|
continue;
|
2016-08-26 03:41:08 +08:00
|
|
|
|
2020-04-06 22:09:12 +08:00
|
|
|
if (Obj->isELF() && getElfSymbolType(Obj, Symbol) == ELF::STT_SECTION)
|
|
|
|
continue;
|
2016-08-17 18:17:57 +08:00
|
|
|
|
2020-04-06 22:09:12 +08:00
|
|
|
// Don't ask a Mach-O STAB symbol for its section unless you know that
// STAB symbol's section field refers to a valid section index. Otherwise
|
|
|
|
// the symbol may error trying to load a section that does not exist.
|
|
|
|
if (MachO) {
|
|
|
|
DataRefImpl SymDRI = Symbol.getRawDataRefImpl();
|
|
|
|
uint8_t NType = (MachO->is64Bit() ?
|
|
|
|
MachO->getSymbol64TableEntry(SymDRI).n_type:
|
|
|
|
MachO->getSymbolTableEntry(SymDRI).n_type);
|
|
|
|
if (NType & MachO::N_STAB)
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
2019-04-07 16:19:55 +08:00
|
|
|
section_iterator SecI = unwrapOrError(Symbol.getSection(), FileName);
|
2018-06-29 02:57:13 +08:00
|
|
|
if (SecI != Obj->section_end())
|
2020-04-06 22:09:12 +08:00
|
|
|
AllSymbols[*SecI].push_back(createSymbolInfo(Obj, Symbol));
|
2018-06-29 02:57:13 +08:00
|
|
|
else
|
2020-04-06 22:09:12 +08:00
|
|
|
AbsoluteSymbols.push_back(createSymbolInfo(Obj, Symbol));
|
2015-11-18 10:49:19 +08:00
|
|
|
}
|
2020-04-06 22:09:12 +08:00
|
|
|
|
2017-02-08 17:44:18 +08:00
|
|
|
if (AllSymbols.empty() && Obj->isELF())
|
|
|
|
addDynamicElfSymbols(Obj, AllSymbols);
|
2015-11-18 10:49:19 +08:00
|
|
|
|
2018-08-24 23:21:57 +08:00
|
|
|
BumpPtrAllocator A;
|
|
|
|
StringSaver Saver(A);
|
|
|
|
addPltEntries(Obj, AllSymbols, Saver);
|
|
|
|
|
2020-03-25 06:55:34 +08:00
|
|
|
// Create a mapping from virtual address to section. An empty section can
|
2020-04-20 21:23:01 +08:00
|
|
|
// cause more than one section at the same address. Sort such sections to be
|
|
|
|
// before same-addressed non-empty sections so that symbol lookups prefer the
|
|
|
|
// non-empty section.
|
2015-11-18 10:49:19 +08:00
|
|
|
std::vector<std::pair<uint64_t, SectionRef>> SectionAddresses;
|
|
|
|
for (SectionRef Sec : Obj->sections())
|
|
|
|
SectionAddresses.emplace_back(Sec.getAddress(), Sec);
|
2020-04-20 21:23:01 +08:00
|
|
|
llvm::stable_sort(SectionAddresses, [](const auto &LHS, const auto &RHS) {
|
|
|
|
if (LHS.first != RHS.first)
|
|
|
|
return LHS.first < RHS.first;
|
|
|
|
return LHS.second.getSize() < RHS.second.getSize();
|
|
|
|
});
|
2015-11-18 10:49:19 +08:00
|
|
|
|
|
|
|
// Linked executables (.exe and .dll files) typically don't include a real
|
|
|
|
// symbol table but they might contain an export table.
|
|
|
|
if (const auto *COFFObj = dyn_cast<COFFObjectFile>(Obj)) {
|
|
|
|
for (const auto &ExportEntry : COFFObj->export_directories()) {
|
|
|
|
StringRef Name;
|
2020-06-12 04:00:54 +08:00
|
|
|
if (Error E = ExportEntry.getSymbolName(Name))
|
|
|
|
reportError(std::move(E), Obj->getFileName());
|
2015-11-18 10:49:19 +08:00
|
|
|
if (Name.empty())
|
2015-07-08 06:06:59 +08:00
|
|
|
continue;
|
2019-08-21 19:07:31 +08:00
|
|
|
|
2015-11-18 10:49:19 +08:00
|
|
|
uint32_t RVA;
|
2020-06-12 04:00:54 +08:00
|
|
|
if (Error E = ExportEntry.getExportRVA(RVA))
|
|
|
|
reportError(std::move(E), Obj->getFileName());
|
2015-11-18 10:49:19 +08:00
|
|
|
|
|
|
|
uint64_t VA = COFFObj->getImageBase() + RVA;
|
2019-06-30 19:19:56 +08:00
|
|
|
auto Sec = partition_point(
|
|
|
|
SectionAddresses, [VA](const std::pair<uint64_t, SectionRef> &O) {
|
|
|
|
return O.first <= VA;
|
2015-11-18 10:49:19 +08:00
|
|
|
});
|
2019-04-07 13:32:16 +08:00
|
|
|
if (Sec != SectionAddresses.begin()) {
|
2015-11-18 10:49:19 +08:00
|
|
|
--Sec;
|
2016-08-17 18:17:57 +08:00
|
|
|
AllSymbols[Sec->second].emplace_back(VA, Name, ELF::STT_NOTYPE);
|
2019-04-07 13:32:16 +08:00
|
|
|
} else
|
2018-06-29 02:57:13 +08:00
|
|
|
AbsoluteSymbols.emplace_back(VA, Name, ELF::STT_NOTYPE);
|
2015-11-18 10:49:19 +08:00
|
|
|
}
|
2015-07-08 06:06:59 +08:00
|
|
|
}
|
|
|
|
|
2015-11-18 10:49:19 +08:00
|
|
|
// Sort all the symbols so that a simple binary search can be used to find
// them. Multiple symbols can have the same address, so use a stable sort to
// stabilize the output.
|
2020-03-08 04:55:44 +08:00
|
|
|
StringSet<> FoundDisasmSymbolSet;
|
2015-11-18 10:49:19 +08:00
|
|
|
for (std::pair<const SectionRef, SectionSymbolsTy> &SecSyms : AllSymbols)
|
2020-10-08 18:43:34 +08:00
|
|
|
llvm::stable_sort(SecSyms.second);
|
|
|
|
llvm::stable_sort(AbsoluteSymbols);
|
2015-11-18 10:49:19 +08:00
|
|
|
|
2020-03-17 22:21:42 +08:00
|
|
|
std::unique_ptr<DWARFContext> DICtx;
|
|
|
|
LiveVariablePrinter LVP(*Ctx.getRegisterInfo(), *STI);
|
|
|
|
|
|
|
|
if (DbgVariables != DVDisabled) {
|
|
|
|
DICtx = DWARFContext::create(*Obj);
|
|
|
|
for (const std::unique_ptr<DWARFUnit> &CU : DICtx->compile_units())
|
|
|
|
LVP.addCompileUnit(CU->getUnitDIE(false));
|
|
|
|
}
|
|
|
|
|
|
|
|
LLVM_DEBUG(LVP.dump());
|
|
|
|
|
2015-07-29 23:45:39 +08:00
|
|
|
for (const SectionRef &Section : ToolSectionFilter(*Obj)) {
|
2019-06-05 19:37:53 +08:00
|
|
|
if (FilterSections.empty() && !DisassembleAll &&
|
|
|
|
(!Section.isText() || Section.isVirtual()))
|
2014-01-25 08:32:01 +08:00
|
|
|
continue;
|
2011-06-26 01:55:23 +08:00
|
|
|
|
2014-10-08 23:28:58 +08:00
|
|
|
uint64_t SectionAddr = Section.getAddress();
|
|
|
|
uint64_t SectSize = Section.getSize();
|
2014-11-11 17:58:25 +08:00
|
|
|
if (!SectSize)
|
|
|
|
continue;
|
2014-02-25 06:12:11 +08:00
|
|
|
|
2015-11-18 10:49:19 +08:00
|
|
|
// Get the list of all the symbols in this section.
|
|
|
|
SectionSymbolsTy &Symbols = AllSymbols[Section];
|
2019-06-20 08:29:40 +08:00
|
|
|
std::vector<MappingSymbolPair> MappingSymbols;
|
|
|
|
if (hasMappingSymbols(Obj)) {
|
2015-11-18 10:49:19 +08:00
|
|
|
for (const auto &Symb : Symbols) {
|
2020-02-11 08:23:01 +08:00
|
|
|
uint64_t Address = Symb.Addr;
|
|
|
|
StringRef Name = Symb.Name;
|
2015-11-18 10:49:19 +08:00
|
|
|
if (Name.startswith("$d"))
|
2019-06-20 08:29:40 +08:00
|
|
|
MappingSymbols.emplace_back(Address - SectionAddr, 'd');
|
2015-11-18 10:49:19 +08:00
|
|
|
if (Name.startswith("$x"))
|
2019-06-20 08:29:40 +08:00
|
|
|
MappingSymbols.emplace_back(Address - SectionAddr, 'x');
|
2016-08-26 03:41:08 +08:00
|
|
|
if (Name.startswith("$a"))
|
2019-06-20 08:29:40 +08:00
|
|
|
MappingSymbols.emplace_back(Address - SectionAddr, 'a');
|
2016-08-26 03:41:08 +08:00
|
|
|
if (Name.startswith("$t"))
|
2019-06-20 08:29:40 +08:00
|
|
|
MappingSymbols.emplace_back(Address - SectionAddr, 't');
|
2011-07-16 02:39:24 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2019-06-20 08:29:40 +08:00
|
|
|
llvm::sort(MappingSymbols);
|
2011-07-16 02:39:24 +08:00
|
|
|
|
2016-10-06 21:46:08 +08:00
|
|
|
if (Obj->isELF() && Obj->getArch() == Triple::amdgcn) {
|
|
|
|
// AMDGPU disassembler uses symbolizer for printing labels
|
|
|
|
std::unique_ptr<MCRelocationInfo> RelInfo(
|
|
|
|
TheTarget->createMCRelocationInfo(TripleName, Ctx));
|
|
|
|
if (RelInfo) {
|
|
|
|
std::unique_ptr<MCSymbolizer> Symbolizer(
|
|
|
|
TheTarget->createMCSymbolizer(
|
|
|
|
TripleName, nullptr, nullptr, &Symbols, &Ctx, std::move(RelInfo)));
|
|
|
|
DisAsm->setSymbolizer(std::move(Symbolizer));
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-09-04 08:07:59 +08:00
|
|
|
StringRef SegmentName = getSegmentName(MachO, Section);
|
2019-08-14 19:10:11 +08:00
|
|
|
StringRef SectionName = unwrapOrError(Section.getName(), Obj->getFileName());
|
2015-06-04 23:01:05 +08:00
|
|
|
// If the section has no symbol at the start, just insert a dummy one.
|
2020-02-11 08:23:01 +08:00
|
|
|
if (Symbols.empty() || Symbols[0].Addr != 0) {
|
2020-04-06 22:09:12 +08:00
|
|
|
Symbols.insert(Symbols.begin(),
|
|
|
|
createDummySymbolInfo(Obj, SectionAddr, SectionName,
|
|
|
|
Section.isText() ? ELF::STT_FUNC
|
|
|
|
: ELF::STT_OBJECT));
|
2016-08-17 18:17:57 +08:00
|
|
|
}
|
2014-06-27 06:52:05 +08:00
|
|
|
|
|
|
|
SmallString<40> Comments;
|
|
|
|
raw_svector_ostream CommentStream(Comments);
|
Add MCSymbolizer for symbolic/annotated disassembly.
This is a basic first step towards symbolization of disassembled
instructions. This used to be done using externally provided (C API)
callbacks. This patch introduces:
- the MCSymbolizer class, that mimics the same functions that were used
in the X86 and ARM disassemblers to symbolize immediate operands and
to annotate loads based off PC (for things like c string literals).
- the MCExternalSymbolizer class, which implements the old C API.
- the MCRelocationInfo class, which provides a way for targets to
translate relocations (either object::RelocationRef, or disassembler
C API VariantKinds) to MCExprs.
- the MCObjectSymbolizer class, which does symbolization using what it
finds in an object::ObjectFile. This makes simple symbolization (with
no fancy relocation stuff) work for all object formats!
- x86-64 Mach-O and ELF MCRelocationInfos.
- A basic ARM Mach-O MCRelocationInfo, that provides just enough to
support the C API VariantKinds.
Most of what works in otool (the only user of the old symbolization API
that I know of) for x86-64 symbolic disassembly (-tvV) works, namely:
- symbol references: call _foo; jmp 15 <_foo+50>
- relocations: call _foo-_bar; call _foo-4
- __cf?string: leaq 193(%rip), %rax ## literal pool for "hello"
Stub support is the main missing part (because libObject doesn't know,
among other things, about mach-o indirect symbols).
As for the MCSymbolizer API, instead of relying on the disassemblers
to call the tryAdding* methods, maybe this could be done automagically
using InstrInfo? For instance, even though PC-relative LEAs are used
to get the address of string literals in a typical Mach-O file, a MOV
would be used in an ELF file. And right now, the explicit symbolization
only recognizes PC-relative LEAs. InstrInfo should already have
most of what is needed to know what to symbolize, so this can
definitely be improved.
I'd also like to remove object::RelocationRef::getValueString (it seems
only used by relocation printing in objdump), as simply printing the
created MCExpr is definitely enough (and cleaner than string concats).
llvm-svn: 182625
2013-05-24 08:39:57 +08:00
|
|
|
|
2019-05-16 21:24:04 +08:00
|
|
|
ArrayRef<uint8_t> Bytes = arrayRefFromStringRef(
|
|
|
|
unwrapOrError(Section.getContents(), Obj->getFileName()));
|
2014-11-12 10:04:27 +08:00
|
|
|
|
2019-01-28 18:44:01 +08:00
|
|
|
uint64_t VMAAdjustment = 0;
|
|
|
|
if (shouldAdjustVA(Section))
|
|
|
|
VMAAdjustment = AdjustVMA;
|
|
|
|
|
2011-01-20 14:39:06 +08:00
|
|
|
uint64_t Size;
|
|
|
|
uint64_t Index;
|
2018-03-10 03:13:44 +08:00
|
|
|
bool PrintedSection = false;
|
2019-01-22 22:09:37 +08:00
|
|
|
std::vector<RelocationRef> Rels = RelocMap[Section];
|
2019-01-15 17:19:18 +08:00
|
|
|
std::vector<RelocationRef>::const_iterator RelCur = Rels.begin();
|
|
|
|
std::vector<RelocationRef>::const_iterator RelEnd = Rels.end();
|
2011-07-16 02:39:24 +08:00
|
|
|
// Disassemble symbol by symbol.
|
2019-01-15 17:19:18 +08:00
|
|
|
for (unsigned SI = 0, SE = Symbols.size(); SI != SE; ++SI) {
|
2020-02-11 08:23:01 +08:00
|
|
|
std::string SymbolName = Symbols[SI].Name.str();
|
2019-06-22 09:13:04 +08:00
|
|
|
if (Demangle)
|
|
|
|
SymbolName = demangle(SymbolName);
|
|
|
|
|
2020-03-08 04:55:44 +08:00
|
|
|
// Skip if --disassemble-symbols is not empty and the symbol is not in
|
2019-04-20 10:10:48 +08:00
|
|
|
// the list.
|
2020-03-08 04:55:44 +08:00
|
|
|
if (!DisasmSymbolSet.empty() && !DisasmSymbolSet.count(SymbolName))
|
2016-09-13 01:08:22 +08:00
|
|
|
continue;
|
2019-04-08 00:33:24 +08:00
|
|
|
|
2020-02-11 08:23:01 +08:00
|
|
|
uint64_t Start = Symbols[SI].Addr;
|
2019-04-20 15:19:24 +08:00
|
|
|
if (Start < SectionAddr || StopAddress <= Start)
|
|
|
|
continue;
|
2019-06-08 04:34:31 +08:00
|
|
|
else
|
2020-03-08 04:55:44 +08:00
|
|
|
FoundDisasmSymbolSet.insert(SymbolName);
|
2016-09-13 01:08:22 +08:00
|
|
|
|
2019-04-20 10:10:48 +08:00
|
|
|
// The end is the section end, the beginning of the next symbol, or
|
|
|
|
// --stop-address.
|
2019-04-20 15:48:41 +08:00
|
|
|
uint64_t End = std::min<uint64_t>(SectionAddr + SectSize, StopAddress);
|
|
|
|
if (SI + 1 < SE)
|
2020-02-11 08:23:01 +08:00
|
|
|
End = std::min(End, Symbols[SI + 1].Addr);
|
2019-04-20 15:19:24 +08:00
|
|
|
if (Start >= End || End <= StartAddress)
|
2018-03-10 03:13:44 +08:00
|
|
|
continue;
|
2019-04-20 10:10:48 +08:00
|
|
|
Start -= SectionAddr;
|
|
|
|
End -= SectionAddr;
|
2018-03-10 03:13:44 +08:00
|
|
|
|
|
|
|
if (!PrintedSection) {
|
|
|
|
PrintedSection = true;
|
2019-05-01 18:40:48 +08:00
|
|
|
outs() << "\nDisassembly of section ";
|
2018-03-10 03:13:44 +08:00
|
|
|
if (!SegmentName.empty())
|
|
|
|
outs() << SegmentName << ",";
|
2019-05-01 18:40:48 +08:00
|
|
|
outs() << SectionName << ":\n";
|
2018-03-10 03:13:44 +08:00
|
|
|
}
|
|
|
|
|
2018-12-19 18:21:45 +08:00
|
|
|
outs() << '\n';
|
2019-01-09 22:43:33 +08:00
|
|
|
if (!NoLeadingAddr)
|
2019-06-17 17:59:55 +08:00
|
|
|
outs() << format(Is64Bits ? "%016" PRIx64 " " : "%08" PRIx64 " ",
|
2019-01-28 18:44:01 +08:00
|
|
|
SectionAddr + Start + VMAAdjustment);
|
2020-04-06 22:09:12 +08:00
|
|
|
if (Obj->isXCOFF() && SymbolDescription) {
|
2020-04-22 05:52:08 +08:00
|
|
|
outs() << getXCOFFSymbolDescription(Symbols[SI], SymbolName) << ":\n";
|
2020-04-06 22:09:12 +08:00
|
|
|
} else
|
|
|
|
outs() << '<' << SymbolName << ">:\n";
|
2011-07-16 02:39:24 +08:00
|
|
|
|
2018-04-20 01:02:57 +08:00
|
|
|
// Don't print raw contents of a virtual section. A virtual section
|
|
|
|
// doesn't have any contents in the file.
|
|
|
|
if (Section.isVirtual()) {
|
|
|
|
outs() << "...\n";
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
[MC] Pass the symbol rather than its name to onSymbolStart()
Summary: This allows targets to also consider the symbol's type and/or address if needed.
Reviewers: scott.linder, jhenderson, MaskRay, aardappel
Reviewed By: scott.linder, MaskRay
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, aheejin, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82090
2020-06-18 03:52:08 +08:00
|
|
|
auto Status = DisAsm->onSymbolStart(Symbols[SI], Size,
|
[MC] Changes to help improve target specific symbol disassembly
Summary:
This commit slightly modifies the MCDisassembler, and llvm-objdump to
allow targets to also decode entire symbols.
WebAssembly uses the onSymbolStart hook to decode preludes.
WebAssembly partially disassembles the symbol in its target specific
way; and then falls back to the normal flow of llvm-objdump.
AMDGPU needs it to decode kernel descriptors entirely, and move to the
next symbol.
This commit is to split the above task into 2.
- Changes to llvm-objdump and MC-layer without breaking WebAssembly code
[ this commit ]
- AMDGPU's implementation of onSymbolStart that decodes kernel
descriptors. [ https://reviews.llvm.org/D80713 ]
Reviewers: scott.linder, t-tye, sunfish, arsenm, jhenderson, MaskRay, aardappel
Reviewed By: scott.linder, jhenderson, aardappel
Subscribers: bcain, dschuff, wdng, tpr, sbc100, jgravelle-google, hiraditya, aheejin, MaskRay, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80512
2020-06-13 02:00:33 +08:00
|
|
|
Bytes.slice(Start, End - Start),
|
|
|
|
SectionAddr + Start, CommentStream);
|
|
|
|
// To have round trippable disassembly, we fall back to decoding the
|
|
|
|
// remaining bytes as instructions.
|
|
|
|
//
|
|
|
|
// If there is a failure, we disassemble the failed region as bytes before
|
|
|
|
// falling back. The target is expected to print nothing in this case.
|
|
|
|
//
|
|
|
|
// If the status is Success or SoftFail, i.e. no 'real' failure, we advance by
// Size bytes before falling back.
|
|
|
|
// So if the entire symbol is 'eaten' by the target:
|
|
|
|
// Start += Size // Now Start = End and we will never decode as
|
|
|
|
// // instructions
|
|
|
|
//
|
|
|
|
// Right now, most targets return None, i.e. they do not treat a symbol
// separately. But WebAssembly decodes preludes for some symbols.
|
|
|
|
//
|
|
|
|
if (Status.hasValue()) {
|
|
|
|
if (Status.getValue() == MCDisassembler::Fail) {
|
|
|
|
outs() << "// Error in decoding " << SymbolName
|
|
|
|
<< " : Decoding failed region as bytes.\n";
|
|
|
|
for (uint64_t I = 0; I < Size; ++I) {
|
|
|
|
outs() << "\t.byte\t " << format_hex(Bytes[Start + I], 1, /*Upper=*/true)
|
|
|
|
<< "\n";
|
|
|
|
}
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
Size = 0;
|
|
|
|
}
|
|
|
|
|
2019-01-18 02:14:09 +08:00
|
|
|
Start += Size;
|
|
|
|
|
2019-04-08 00:33:24 +08:00
|
|
|
Index = Start;
|
|
|
|
if (SectionAddr < StartAddress)
|
|
|
|
Index = std::max<uint64_t>(Index, StartAddress - SectionAddr);
|
|
|
|
|
2019-06-25 01:47:56 +08:00
|
|
|
// If there is a data/common symbol inside an ELF text section and we are
|
2019-04-08 00:33:24 +08:00
|
|
|
// only disassembling text (applicable to all architectures), we are in a
|
|
|
|
// situation where we must print the data and not disassemble it.
|
2019-06-25 01:47:56 +08:00
|
|
|
if (Obj->isELF() && !DisassembleAll && Section.isText()) {
|
2020-02-11 08:23:01 +08:00
|
|
|
uint8_t SymTy = Symbols[SI].Type;
|
2019-06-25 01:47:56 +08:00
|
|
|
if (SymTy == ELF::STT_OBJECT || SymTy == ELF::STT_COMMON) {
|
|
|
|
dumpELFData(SectionAddr, Index, End, Bytes);
|
|
|
|
Index = End;
|
|
|
|
}
|
2019-04-08 00:33:24 +08:00
|
|
|
}
|
2011-09-20 01:56:04 +08:00
|
|
|
|
2019-06-20 08:29:40 +08:00
|
|
|
bool CheckARMELFData = hasMappingSymbols(Obj) &&
|
2020-02-11 08:23:01 +08:00
|
|
|
Symbols[SI].Type != ELF::STT_OBJECT &&
|
2019-04-08 00:33:24 +08:00
|
|
|
!DisassembleAll;
|
2020-05-02 13:52:42 +08:00
|
|
|
bool DumpARMELFData = false;
|
2020-03-17 22:21:42 +08:00
|
|
|
formatted_raw_ostream FOS(outs());
|
[llvm-objdump] Symbolize binary addresses for low-noisy asm diff.
When diffing disassembly dump of two binaries, I see lots of noises from mismatched jump target addresses and global data references, which unnecessarily causes diffs on every function, making it impractical. I'm trying to symbolize the raw binary addresses to minimize the diff noise.
In this change, a local branch target is modeled as a label and the branch target operand will simply be printed as a label. Local labels are collected by a separate pre-decoding pass beforehand. A global data memory operand will be printed as a global symbol instead of the raw data address. Unfortunately, due to the way the disassembler is set up and to be less intrusive, a global symbol is always printed as the last operand of a memory access instruction. This is less than ideal but is probably acceptable from checking code quality point of view since on most targets an instruction can have at most one memory operand.
So far only the X86 disassemblers are supported.
Test Plan:
llvm-objdump -d --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr :
```
Disassembly of section .text:
<_start>:
push rax
mov dword ptr [rsp + 4], 0
mov dword ptr [rsp], 0
mov eax, dword ptr [rsp]
cmp eax, dword ptr [rip + 4112] # 202182 <g>
jge 0x20117e <_start+0x25>
call 0x201158 <foo>
inc dword ptr [rsp]
jmp 0x201169 <_start+0x10>
xor eax, eax
pop rcx
ret
```
llvm-objdump -d **--symbolize-operands** --x86-asm-syntax=intel --no-show-raw-insn --no-leading-addr :
```
Disassembly of section .text:
<_start>:
push rax
mov dword ptr [rsp + 4], 0
mov dword ptr [rsp], 0
<L1>:
mov eax, dword ptr [rsp]
cmp eax, dword ptr <g>
jge <L0>
call <foo>
inc dword ptr [rsp]
jmp <L1>
<L0>:
xor eax, eax
pop rcx
ret
```
Note that without this change, a jump instruction such as `jge 0x20117e <_start+0x25>` is printed with a real target address and an offset from the leading symbol. If a change in the optimizer adds or deletes an instruction, the address and offset may shift for targets placed after that instruction. This is a problem when diffing the disassembly from two optimizers, because such branch target address changes produce unnecessary false positives. With `--symbolize-operands`, a label is printed for a branch target instead, which reduces the false positives. Similarly, the disassembly of PC-relative global variable references is also sensitive to instruction insertion/deletion.
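As a rough illustration of the two-pass idea described above (not the code from this patch), the sketch below first collects local branch targets into a label map and then prints labels in place of raw addresses. `ToyInst`, `collectLabels`, and `render` are hypothetical names, the addresses are made up, and numbering labels in order of first use is an assumption chosen to be consistent with the sample output.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a decoded instruction: its address, its text,
// and, if it is a direct branch, the address it jumps to.
struct ToyInst {
  uint64_t Addr;
  std::string Mnemonic;
  bool IsBranch;
  uint64_t Target;
};

// Pass 1: name every branch target that falls inside [Start, End).
// Assumption: labels are numbered in order of first use (L0, L1, ...).
std::unordered_map<uint64_t, std::string>
collectLabels(const std::vector<ToyInst> &Insts, uint64_t Start,
              uint64_t End) {
  assert(Start <= End && "inverted address range");
  std::unordered_map<uint64_t, std::string> Labels;
  for (const ToyInst &I : Insts) {
    if (!I.IsBranch || I.Target < Start || I.Target >= End)
      continue; // not a local branch target
    std::string Name = "L" + std::to_string(Labels.size());
    Labels.emplace(I.Target, Name); // no-op if the target is already named
  }
  return Labels;
}

// Pass 2: emit "<Ln>:" before a labelled instruction and print "<Ln>"
// instead of the raw branch target address.
std::vector<std::string>
render(const std::vector<ToyInst> &Insts,
       const std::unordered_map<uint64_t, std::string> &Labels) {
  std::vector<std::string> Lines;
  for (const ToyInst &I : Insts) {
    auto Here = Labels.find(I.Addr);
    if (Here != Labels.end())
      Lines.push_back("<" + Here->second + ">:");
    std::string Line = I.Mnemonic;
    if (I.IsBranch) {
      auto To = Labels.find(I.Target);
      if (To != Labels.end())
        Line += " <" + To->second + ">";
    }
    Lines.push_back(Line);
  }
  return Lines;
}
```

Because the rendered text contains only label names, inserting or deleting an unrelated instruction changes the addresses but not the printed branch operands, which is the property that keeps the diff small.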
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D84191
2020-07-21 00:45:32 +08:00
|
|
|
|
|
|
|
std::unordered_map<uint64_t, std::string> AllLabels;
|
|
|
|
if (SymbolizeOperands)
|
|
|
|
collectLocalBranchTargets(Bytes, MIA, DisAsm, IP, PrimarySTI,
|
|
|
|
SectionAddr, Index, End, AllLabels);
|
|
|
|
|
2019-04-08 00:33:24 +08:00
|
|
|
while (Index < End) {
|
2019-06-20 08:29:40 +08:00
|
|
|
// ARM and AArch64 ELF binaries can interleave data and text in the
|
|
|
|
// same section. We rely on the markers introduced to understand what
|
|
|
|
// we need to dump. If the data marker is within a function, it is
|
2019-04-08 00:33:24 +08:00
|
|
|
// denoted as a word/short etc.
|
2020-05-02 13:52:42 +08:00
|
|
|
if (CheckARMELFData) {
|
|
|
|
char Kind = getMappingSymbolKind(MappingSymbols, Index);
|
|
|
|
DumpARMELFData = Kind == 'd';
|
|
|
|
if (SecondarySTI) {
|
|
|
|
if (Kind == 'a') {
|
|
|
|
STI = PrimaryIsThumb ? SecondarySTI : PrimarySTI;
|
|
|
|
DisAsm = PrimaryIsThumb ? SecondaryDisAsm : PrimaryDisAsm;
|
|
|
|
} else if (Kind == 't') {
|
|
|
|
STI = PrimaryIsThumb ? PrimarySTI : SecondarySTI;
|
|
|
|
DisAsm = PrimaryIsThumb ? PrimaryDisAsm : SecondaryDisAsm;
|
|
|
|
}
|
2019-02-19 20:38:36 +08:00
|
|
|
}
|
2019-01-10 22:55:26 +08:00
|
|
|
}
|
|
|
|
|
2020-05-02 13:52:42 +08:00
|
|
|
if (DumpARMELFData) {
|
|
|
|
Size = dumpARMELFData(SectionAddr, Index, End, Obj, Bytes,
|
2020-03-17 22:21:42 +08:00
|
|
|
MappingSymbols, FOS);
|
2020-05-02 13:52:42 +08:00
|
|
|
} else {
|
|
|
|
// When -z or --disassemble-zeroes are given we always disassemble
|
|
|
|
// them. Otherwise we might want to skip zero bytes we see.
|
|
|
|
if (!DisassembleZeroes) {
|
|
|
|
uint64_t MaxOffset = End - Index;
|
|
|
|
// For --reloc: print zero blocks patched by relocations, so that
|
|
|
|
// relocations can be shown in the dump.
|
|
|
|
if (RelCur != RelEnd)
|
|
|
|
MaxOffset = RelCur->getOffset() - Index;
|
|
|
|
|
|
|
|
if (size_t N =
|
|
|
|
countSkippableZeroBytes(Bytes.slice(Index, MaxOffset))) {
|
2020-03-17 22:21:42 +08:00
|
|
|
FOS << "\t\t..." << '\n';
|
2020-05-02 13:52:42 +08:00
|
|
|
Index += N;
|
|
|
|
continue;
|
|
|
|
}
|
2019-06-20 08:29:40 +08:00
|
|
|
}
|
|
|
|
|
2020-07-21 00:45:32 +08:00
|
|
|
// Print the local label, if there is one.
|
|
|
|
auto Iter = AllLabels.find(SectionAddr + Index);
|
|
|
|
if (Iter != AllLabels.end())
|
|
|
|
FOS << "<" << Iter->second << ">:\n";
|
|
|
|
|
2020-05-02 13:52:42 +08:00
|
|
|
// Disassemble a real instruction, or data when --disassemble-all is
|
|
|
|
// provided.
|
|
|
|
MCInst Inst;
|
|
|
|
bool Disassembled =
|
|
|
|
DisAsm->getInstruction(Inst, Size, Bytes.slice(Index),
|
|
|
|
SectionAddr + Index, CommentStream);
|
|
|
|
if (Size == 0)
|
|
|
|
Size = 1;
|
|
|
|
|
2020-03-17 22:21:42 +08:00
|
|
|
LVP.update({Index, Section.getIndex()},
|
|
|
|
{Index + Size, Section.getIndex()}, Index + Size != End);
|
|
|
|
|
2020-05-02 13:52:42 +08:00
|
|
|
PIP.printInst(
|
|
|
|
*IP, Disassembled ? &Inst : nullptr, Bytes.slice(Index, Size),
|
2020-03-17 22:21:42 +08:00
|
|
|
{SectionAddr + Index + VMAAdjustment, Section.getIndex()}, FOS,
|
|
|
|
"", *STI, &SP, Obj->getFileName(), &Rels, LVP);
|
|
|
|
FOS << CommentStream.str();
|
2020-05-02 13:52:42 +08:00
|
|
|
Comments.clear();
|
|
|
|
|
|
|
|
// If disassembly has failed, avoid analysing invalid/incomplete
|
|
|
|
// instruction information. Otherwise, try to resolve the target
|
|
|
|
// address (jump target or memory operand address) and print it on the
|
|
|
|
// right of the instruction.
|
|
|
|
if (Disassembled && MIA) {
|
|
|
|
uint64_t Target;
|
|
|
|
bool PrintTarget =
|
|
|
|
MIA->evaluateBranch(Inst, SectionAddr + Index, Size, Target);
|
|
|
|
if (!PrintTarget)
|
|
|
|
if (Optional<uint64_t> MaybeTarget =
|
|
|
|
MIA->evaluateMemoryOperandAddress(
|
|
|
|
Inst, SectionAddr + Index, Size)) {
|
|
|
|
Target = *MaybeTarget;
|
|
|
|
PrintTarget = true;
|
2020-07-21 00:45:32 +08:00
|
|
|
// Do not print real address when symbolizing.
|
|
|
|
if (!SymbolizeOperands)
|
|
|
|
FOS << " # " << Twine::utohexstr(Target);
|
2015-11-18 10:49:19 +08:00
|
|
|
}
|
2020-05-02 13:52:42 +08:00
|
|
|
if (PrintTarget) {
|
|
|
|
// In a relocatable object, the target must reside in
|
|
|
|
// the same section as the call instruction or it is accessed
|
|
|
|
// through a relocation.
|
|
|
|
//
|
|
|
|
// In a non-relocatable object, the target may be in any section.
|
|
|
|
// In that case, locate the section(s) containing the target
|
|
|
|
// address and find the symbol in one of those, if possible.
|
|
|
|
//
|
|
|
|
// N.B. We don't walk the relocations in the relocatable case yet.
|
|
|
|
std::vector<const SectionSymbolsTy *> TargetSectionSymbols;
|
|
|
|
if (!Obj->isRelocatableObject()) {
|
|
|
|
auto It = llvm::partition_point(
|
|
|
|
SectionAddresses,
|
|
|
|
[=](const std::pair<uint64_t, SectionRef> &O) {
|
|
|
|
return O.first <= Target;
|
|
|
|
});
|
|
|
|
uint64_t TargetSecAddr = 0;
|
|
|
|
while (It != SectionAddresses.begin()) {
|
|
|
|
--It;
|
|
|
|
if (TargetSecAddr == 0)
|
|
|
|
TargetSecAddr = It->first;
|
|
|
|
if (It->first != TargetSecAddr)
|
|
|
|
break;
|
|
|
|
TargetSectionSymbols.push_back(&AllSymbols[It->second]);
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
TargetSectionSymbols.push_back(&Symbols);
|
|
|
|
}
|
|
|
|
TargetSectionSymbols.push_back(&AbsoluteSymbols);
|
|
|
|
|
|
|
|
// Find the last symbol in the first candidate section whose
|
|
|
|
// offset is less than or equal to the target. If there are no
|
|
|
|
// such symbols, try in the next section and so on, before finally
|
|
|
|
// using the nearest preceding absolute symbol (if any), if there
|
|
|
|
// are no other valid symbols.
|
|
|
|
const SymbolInfoTy *TargetSym = nullptr;
|
|
|
|
for (const SectionSymbolsTy *TargetSymbols :
|
|
|
|
TargetSectionSymbols) {
|
|
|
|
auto It = llvm::partition_point(
|
|
|
|
*TargetSymbols,
|
|
|
|
[=](const SymbolInfoTy &O) { return O.Addr <= Target; });
|
|
|
|
if (It != TargetSymbols->begin()) {
|
|
|
|
TargetSym = &*(It - 1);
|
|
|
|
break;
|
|
|
|
}
|
2020-04-20 21:23:01 +08:00
|
|
|
}
|
|
|
|
|
2020-07-21 00:45:32 +08:00
|
|
|
// Print the label corresponding to the target, if there is one.
|
|
|
|
bool LabelAvailable = AllLabels.count(Target);
|
2020-05-02 13:52:42 +08:00
|
|
|
if (TargetSym != nullptr) {
|
|
|
|
uint64_t TargetAddress = TargetSym->Addr;
|
2020-07-21 00:45:32 +08:00
|
|
|
uint64_t Disp = Target - TargetAddress;
|
2020-05-02 13:52:42 +08:00
|
|
|
std::string TargetName = TargetSym->Name.str();
|
|
|
|
if (Demangle)
|
|
|
|
TargetName = demangle(TargetName);
|
|
|
|
|
2020-07-21 00:45:32 +08:00
|
|
|
FOS << " <";
|
|
|
|
if (!Disp) {
|
|
|
|
// Always print the binary symbol precisely corresponding to
|
|
|
|
// the target address.
|
|
|
|
FOS << TargetName;
|
|
|
|
} else if (!LabelAvailable) {
|
|
|
|
// Always print the binary symbol plus an offset if there's no
|
|
|
|
// local label corresponding to the target address.
|
|
|
|
FOS << TargetName << "+0x" << Twine::utohexstr(Disp);
|
|
|
|
} else {
|
|
|
|
FOS << AllLabels[Target];
|
|
|
|
}
|
|
|
|
FOS << ">";
|
|
|
|
} else if (LabelAvailable) {
|
|
|
|
FOS << " <" << AllLabels[Target] << ">";
|
2020-05-02 13:52:42 +08:00
|
|
|
}
|
2015-07-08 06:06:59 +08:00
|
|
|
}
|
|
|
|
}
|
2011-07-21 03:37:35 +08:00
|
|
|
}
|
2020-03-17 22:21:42 +08:00
|
|
|
|
|
|
|
LVP.printAfterInst(FOS);
|
|
|
|
FOS << "\n";
|
2011-10-14 06:17:18 +08:00
|
|
|
|
2018-05-15 03:46:08 +08:00
|
|
|
// Hexagon does this in its pretty printer.
|
2019-01-23 18:52:38 +08:00
|
|
|
if (Obj->getArch() != Triple::hexagon) {
|
2020-04-13 23:49:36 +08:00
|
|
|
// Print relocation for instruction and data.
|
2019-01-15 17:19:18 +08:00
|
|
|
while (RelCur != RelEnd) {
|
2019-01-23 21:39:12 +08:00
|
|
|
uint64_t Offset = RelCur->getOffset();
|
2018-05-15 03:46:08 +08:00
|
|
|
// If this relocation is hidden, skip it.
|
2019-04-08 00:33:24 +08:00
|
|
|
if (getHidden(*RelCur) || SectionAddr + Offset < StartAddress) {
|
2019-01-15 17:19:18 +08:00
|
|
|
++RelCur;
|
2018-05-15 03:46:08 +08:00
|
|
|
continue;
|
|
|
|
}
|
2011-10-26 04:35:53 +08:00
|
|
|
|
2020-04-13 23:49:36 +08:00
|
|
|
// Stop when RelCur's offset is past the disassembled
|
|
|
|
// instruction/data. Note that it's possible the disassembled data
|
|
|
|
// is not the complete data: we might see the relocation printed in
|
|
|
|
// the middle of the data, but this matches the binutils objdump
|
|
|
|
// output.
|
2019-01-23 21:39:12 +08:00
|
|
|
if (Offset >= Index + Size)
|
2019-01-15 17:19:18 +08:00
|
|
|
break;
|
2019-01-23 21:39:12 +08:00
|
|
|
|
2019-01-28 18:44:01 +08:00
|
|
|
// When --adjust-vma is used, update the address printed.
|
|
|
|
if (RelCur->getSymbol() != Obj->symbol_end()) {
|
|
|
|
Expected<section_iterator> SymSI =
|
|
|
|
RelCur->getSymbol()->getSection();
|
|
|
|
if (SymSI && *SymSI != Obj->section_end() &&
|
2019-04-08 00:33:24 +08:00
|
|
|
shouldAdjustVA(**SymSI))
|
2019-01-28 18:44:01 +08:00
|
|
|
Offset += AdjustVMA;
|
|
|
|
}
|
|
|
|
|
2020-03-17 22:21:42 +08:00
|
|
|
printRelocation(FOS, Obj->getFileName(), *RelCur,
|
|
|
|
SectionAddr + Offset, Is64Bits);
|
|
|
|
LVP.printAfterOtherLine(FOS, true);
|
2019-01-15 17:19:18 +08:00
|
|
|
++RelCur;
|
2016-09-13 01:08:22 +08:00
|
|
|
}
|
2019-01-23 18:52:38 +08:00
|
|
|
}
|
2019-04-08 00:33:24 +08:00
|
|
|
|
|
|
|
Index += Size;
|
2011-07-21 03:37:35 +08:00
|
|
|
}
|
2011-01-20 14:39:06 +08:00
|
|
|
}
|
|
|
|
}
|
2020-03-08 04:55:44 +08:00
|
|
|
StringSet<> MissingDisasmSymbolSet =
|
|
|
|
set_difference(DisasmSymbolSet, FoundDisasmSymbolSet);
|
|
|
|
for (StringRef Sym : MissingDisasmSymbolSet.keys())
|
|
|
|
reportWarning("failed to disassemble missing symbol " + Sym, FileName);
|
2011-01-20 14:39:06 +08:00
|
|
|
}
|
|
|
|
|
2019-01-23 18:33:26 +08:00
|
|
|
static void disassembleObject(const ObjectFile *Obj, bool InlineRelocs) {
|
|
|
|
const Target *TheTarget = getTarget(Obj);
|
|
|
|
|
|
|
|
// Package up features to be passed to target/subtarget
|
|
|
|
SubtargetFeatures Features = Obj->getFeatures();
|
|
|
|
if (!MAttrs.empty())
|
|
|
|
for (unsigned I = 0; I != MAttrs.size(); ++I)
|
|
|
|
Features.AddFeature(MAttrs[I]);
|
|
|
|
|
|
|
|
std::unique_ptr<const MCRegisterInfo> MRI(
|
|
|
|
TheTarget->createMCRegInfo(TripleName));
|
|
|
|
if (!MRI)
|
2019-08-21 19:07:31 +08:00
|
|
|
reportError(Obj->getFileName(),
|
|
|
|
"no register info for target " + TripleName);
|
2019-01-23 18:33:26 +08:00
|
|
|
|
|
|
|
// Set up disassembler.
|
2019-10-23 18:24:35 +08:00
|
|
|
MCTargetOptions MCOptions;
|
2019-01-23 18:33:26 +08:00
|
|
|
std::unique_ptr<const MCAsmInfo> AsmInfo(
|
2019-10-23 18:24:35 +08:00
|
|
|
TheTarget->createMCAsmInfo(*MRI, TripleName, MCOptions));
|
2019-01-23 18:33:26 +08:00
|
|
|
if (!AsmInfo)
|
2019-08-21 19:07:31 +08:00
|
|
|
reportError(Obj->getFileName(),
|
|
|
|
"no assembly info for target " + TripleName);
|
2020-07-24 17:51:46 +08:00
|
|
|
|
|
|
|
if (MCPU.empty())
|
|
|
|
MCPU = Obj->tryGetCPUName().getValueOr("").str();
|
|
|
|
|
2019-01-23 18:33:26 +08:00
|
|
|
std::unique_ptr<const MCSubtargetInfo> STI(
|
|
|
|
TheTarget->createMCSubtargetInfo(TripleName, MCPU, Features.getString()));
|
|
|
|
if (!STI)
|
2019-08-21 19:07:31 +08:00
|
|
|
reportError(Obj->getFileName(),
|
|
|
|
"no subtarget info for target " + TripleName);
|
2019-01-23 18:33:26 +08:00
|
|
|
std::unique_ptr<const MCInstrInfo> MII(TheTarget->createMCInstrInfo());
|
|
|
|
if (!MII)
|
2019-08-21 19:07:31 +08:00
|
|
|
reportError(Obj->getFileName(),
|
|
|
|
"no instruction info for target " + TripleName);
|
2019-01-23 18:33:26 +08:00
|
|
|
MCObjectFileInfo MOFI;
|
|
|
|
MCContext Ctx(AsmInfo.get(), MRI.get(), &MOFI);
|
|
|
|
// FIXME: for now initialize MCObjectFileInfo with default values
|
|
|
|
MOFI.InitMCObjectFileInfo(Triple(TripleName), false, Ctx);
|
|
|
|
|
|
|
|
std::unique_ptr<MCDisassembler> DisAsm(
|
|
|
|
TheTarget->createMCDisassembler(*STI, Ctx));
|
|
|
|
if (!DisAsm)
|
2019-08-21 19:07:31 +08:00
|
|
|
reportError(Obj->getFileName(), "no disassembler for target " + TripleName);
|
2019-01-23 18:33:26 +08:00
|
|
|
|
2019-06-20 08:29:40 +08:00
|
|
|
// If we have an ARM object file, we need a second disassembler, because
|
|
|
|
// ARM CPUs have two different instruction sets: ARM mode, and Thumb mode.
|
|
|
|
// We use mapping symbols to switch between the two disassemblers, where
|
|
|
|
// appropriate.
|
|
|
|
std::unique_ptr<MCDisassembler> SecondaryDisAsm;
|
|
|
|
std::unique_ptr<const MCSubtargetInfo> SecondarySTI;
|
|
|
|
if (isArmElf(Obj) && !STI->checkFeatures("+mclass")) {
|
|
|
|
if (STI->checkFeatures("+thumb-mode"))
|
|
|
|
Features.AddFeature("-thumb-mode");
|
|
|
|
else
|
|
|
|
Features.AddFeature("+thumb-mode");
|
|
|
|
SecondarySTI.reset(TheTarget->createMCSubtargetInfo(TripleName, MCPU,
|
|
|
|
Features.getString()));
|
|
|
|
SecondaryDisAsm.reset(TheTarget->createMCDisassembler(*SecondarySTI, Ctx));
|
|
|
|
}
|
|
|
|
|
2019-01-23 18:33:26 +08:00
|
|
|
std::unique_ptr<const MCInstrAnalysis> MIA(
|
|
|
|
TheTarget->createMCInstrAnalysis(MII.get()));
|
|
|
|
|
|
|
|
int AsmPrinterVariant = AsmInfo->getAssemblerDialect();
|
|
|
|
std::unique_ptr<MCInstPrinter> IP(TheTarget->createMCInstPrinter(
|
|
|
|
Triple(TripleName), AsmPrinterVariant, *AsmInfo, *MII, *MRI));
|
|
|
|
if (!IP)
|
2019-08-21 19:07:31 +08:00
|
|
|
reportError(Obj->getFileName(),
|
|
|
|
"no instruction printer for target " + TripleName);
|
2019-01-23 18:33:26 +08:00
|
|
|
IP->setPrintImmHex(PrintImmHex);
|
[MCInstPrinter] Pass `Address` parameter to MCOI::OPERAND_PCREL typed operands. NFC
Follow-up of D72172 and D72180
This patch passes `uint64_t Address` to print methods of PC-relative
operands so that subsequent target specific patches can change
`*InstPrinter::print{Operand,PCRelImm,...}` to customize the output.
Add MCInstPrinter::PrintBranchImmAsAddress which is set to true by
llvm-objdump.
```
// Current llvm-objdump -d output
aarch64: 20000: bl #0
ppc: 20000: bl .+4
x86: 20000: callq 0
// Ideal output
aarch64: 20000: bl 0x20000
ppc: 20000: bl 0x20004
x86: 20000: callq 0x20005
// GNU objdump -d. The lack of 0x is not ideal because the result cannot be re-assembled
aarch64: 20000: bl 20000
ppc: 20000: bl 0x20004
x86: 20000: callq 20005
```
In `lib/Target/X86/X86GenAsmWriter1.inc` (generated by `llvm-tblgen -gen-asm-writer`):
```
case 12:
// CALL64pcrel32, CALLpcrel16, CALLpcrel32, EH_SjLj_Setup, JCXZ, JECXZ, J...
- printPCRelImm(MI, 0, O);
+ printPCRelImm(MI, Address, 0, O);
return;
```
Some targets have 2 `printOperand` overloads, one without `Address` and
one with `Address`. They should annotate derived `Operand` properly with
`let OperandType = "OPERAND_PCREL"`.
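The address arithmetic behind the "ideal output" above can be sketched as follows. This is only an illustration under stated assumptions, not the printer code from this patch: `formatBranchTarget` and its parameters are hypothetical, and the `RelativeToNextInst` flag models the difference between x86 (displacement relative to the end of the instruction) and AArch64/PowerPC (displacement relative to the instruction's own address).

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: turn a PC-relative displacement into the absolute
// address that would be printed, formatted with a leading "0x" so the
// result could be fed back to an assembler.
std::string formatBranchTarget(uint64_t Address, uint64_t InstSize,
                               int64_t Disp, bool RelativeToNextInst) {
  assert(InstSize > 0 && "an instruction occupies at least one byte");
  // x86 measures the displacement from the next instruction; the AArch64
  // and PowerPC branches shown above measure it from the branch itself.
  uint64_t Base = RelativeToNextInst ? Address + InstSize : Address;
  uint64_t Target = Base + static_cast<uint64_t>(Disp);
  char Buf[32];
  std::snprintf(Buf, sizeof(Buf), "0x%llx",
                static_cast<unsigned long long>(Target));
  return Buf;
}
```

With the three instructions from the listing, all located at address 0x20000, the helper reproduces the ideal column: a 4-byte `bl #0` and `bl .+4` resolve relative to the branch, while the 5-byte `callq 0` resolves relative to the following instruction.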
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D76574
2020-03-23 03:32:27 +08:00
|
|
|
IP->setPrintBranchImmAsAddress(true);
|
2020-07-21 00:45:32 +08:00
|
|
|
IP->setSymbolizeOperands(SymbolizeOperands);
|
|
|
|
IP->setMCInstrAnalysis(MIA.get());
|
2019-01-23 18:33:26 +08:00
|
|
|
|
|
|
|
PrettyPrinter &PIP = selectPrettyPrinter(Triple(TripleName));
|
|
|
|
SourcePrinter SP(Obj, TheTarget->getName());
|
|
|
|
|
2019-02-26 20:15:14 +08:00
|
|
|
for (StringRef Opt : DisassemblerOptions)
|
|
|
|
if (!IP->applyTargetSpecificCLOption(Opt))
|
2019-08-21 19:07:31 +08:00
|
|
|
reportError(Obj->getFileName(),
|
|
|
|
"Unrecognized disassembler option: " + Opt);
|
2019-02-26 20:15:14 +08:00
|
|
|
|
2019-06-20 08:29:40 +08:00
|
|
|
disassembleObject(TheTarget, Obj, Ctx, DisAsm.get(), SecondaryDisAsm.get(),
|
|
|
|
MIA.get(), IP.get(), STI.get(), SecondarySTI.get(), PIP,
|
|
|
|
SP, InlineRelocs);
|
2019-01-23 18:33:26 +08:00
|
|
|
}
|
|
|
|
|
2020-05-31 09:00:14 +08:00
|
|
|
void objdump::printRelocations(const ObjectFile *Obj) {
|
2014-03-21 06:55:15 +08:00
|
|
|
StringRef Fmt = Obj->getBytesInAddress() > 4 ? "%016" PRIx64 :
|
|
|
|
"%08" PRIx64;
|
2016-03-22 04:59:15 +08:00
|
|
|
// Regular objdump doesn't print relocations in non-relocatable object
|
|
|
|
// files.
|
|
|
|
if (!Obj->isRelocatableObject())
|
|
|
|
return;
|
2014-08-18 03:09:37 +08:00
|
|
|
|
2019-05-07 21:14:18 +08:00
|
|
|
// Build a mapping from relocation target to a vector of relocation
|
|
|
|
// sections. Usually, there is only one relocation section for
|
|
|
|
// each relocated section.
|
|
|
|
MapVector<SectionRef, std::vector<SectionRef>> SecToRelSec;
|
2019-10-21 19:06:38 +08:00
|
|
|
uint64_t Ndx;
|
|
|
|
for (const SectionRef &Section : ToolSectionFilter(*Obj, &Ndx)) {
|
2014-03-13 22:37:36 +08:00
|
|
|
if (Section.relocation_begin() == Section.relocation_end())
|
2011-10-08 08:18:30 +08:00
|
|
|
continue;
|
2019-10-21 19:06:38 +08:00
|
|
|
Expected<section_iterator> SecOrErr = Section.getRelocatedSection();
|
|
|
|
if (!SecOrErr)
|
|
|
|
reportError(Obj->getFileName(),
|
|
|
|
"section (" + Twine(Ndx) +
|
|
|
|
"): unable to get a relocation target: " +
|
|
|
|
toString(SecOrErr.takeError()));
|
|
|
|
SecToRelSec[**SecOrErr].push_back(Section);
|
2019-05-07 21:14:18 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
for (std::pair<SectionRef, std::vector<SectionRef>> &P : SecToRelSec) {
|
2019-08-14 19:10:11 +08:00
|
|
|
StringRef SecName = unwrapOrError(P.first.getName(), Obj->getFileName());
|
2019-01-15 17:19:18 +08:00
|
|
|
outs() << "RELOCATION RECORDS FOR [" << SecName << "]:\n";
|
2020-02-11 19:10:57 +08:00
|
|
|
uint32_t OffsetPadding = (Obj->getBytesInAddress() > 4 ? 16 : 8);
|
|
|
|
uint32_t TypePadding = 24;
|
|
|
|
outs() << left_justify("OFFSET", OffsetPadding) << " "
|
|
|
|
<< left_justify("TYPE", TypePadding) << " "
|
|
|
|
<< "VALUE\n";
|
2019-05-07 21:14:18 +08:00
|
|
|
|
|
|
|
for (SectionRef Section : P.second) {
|
|
|
|
for (const RelocationRef &Reloc : Section.relocations()) {
|
|
|
|
uint64_t Address = Reloc.getOffset();
|
|
|
|
SmallString<32> RelocName;
|
|
|
|
SmallString<32> ValueStr;
|
|
|
|
if (Address < StartAddress || Address > StopAddress || getHidden(Reloc))
|
|
|
|
continue;
|
|
|
|
Reloc.getTypeName(RelocName);
|
2019-08-21 19:07:31 +08:00
|
|
|
if (Error E = getRelocationValueString(Reloc, ValueStr))
|
|
|
|
reportError(std::move(E), Obj->getFileName());
|
|
|
|
|
2020-02-11 19:10:57 +08:00
|
|
|
outs() << format(Fmt.data(), Address) << " "
|
|
|
|
<< left_justify(RelocName, TypePadding) << " " << ValueStr
|
|
|
|
<< "\n";
|
2019-05-07 21:14:18 +08:00
|
|
|
}
|
2011-10-08 08:18:30 +08:00
|
|
|
}
|
|
|
|
outs() << "\n";
|
|
|
|
}
|
|
|
|
}
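The address column in `printRelocations` picks its width from the object's address size. A minimal standalone sketch of that choice follows; `formatAddress` is a hypothetical helper written for illustration and uses `snprintf` rather than LLVM's `format`.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of llvm-objdump): zero-padded hex,
// 16 digits when addresses are wider than 4 bytes, otherwise 8 digits.
std::string formatAddress(uint64_t Address, unsigned BytesInAddress) {
  const char *Fmt = BytesInAddress > 4 ? "%016" PRIx64 : "%08" PRIx64;
  char Buf[32];
  std::snprintf(Buf, sizeof(Buf), Fmt, Address);
  return Buf;
}
```

The width is a minimum, so a value that needs more digits than the chosen width is printed in full rather than truncated.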
|
|
|
|
|
2020-05-31 09:00:14 +08:00
|
|
|
void objdump::printDynamicRelocations(const ObjectFile *Obj) {
|
2018-06-07 21:30:55 +08:00
|
|
|
// For the moment, this option is for ELF only
|
|
|
|
if (!Obj->isELF())
|
|
|
|
return;
|
|
|
|
|
|
|
|
const auto *Elf = dyn_cast<ELFObjectFileBase>(Obj);
|
|
|
|
if (!Elf || Elf->getEType() != ELF::ET_DYN) {
|
2019-08-21 19:07:31 +08:00
|
|
|
reportError(Obj->getFileName(), "not a dynamic object");
|
2018-06-07 21:30:55 +08:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
std::vector<SectionRef> DynRelSec = Obj->dynamic_relocation_sections();
|
|
|
|
if (DynRelSec.empty())
|
|
|
|
return;
|
|
|
|
|
|
|
|
outs() << "DYNAMIC RELOCATION RECORDS\n";
|
2019-01-15 17:19:18 +08:00
|
|
|
StringRef Fmt = Obj->getBytesInAddress() > 4 ? "%016" PRIx64 : "%08" PRIx64;
|
2019-04-24 23:09:23 +08:00
|
|
|
for (const SectionRef &Section : DynRelSec)
|
2018-06-07 21:30:55 +08:00
|
|
|
for (const RelocationRef &Reloc : Section.relocations()) {
|
2019-01-15 17:19:18 +08:00
|
|
|
uint64_t Address = Reloc.getOffset();
|
|
|
|
SmallString<32> RelocName;
|
|
|
|
SmallString<32> ValueStr;
|
|
|
|
Reloc.getTypeName(RelocName);
|
2019-08-21 19:07:31 +08:00
|
|
|
if (Error E = getRelocationValueString(Reloc, ValueStr))
|
|
|
|
reportError(std::move(E), Obj->getFileName());
|
2019-01-15 17:19:18 +08:00
|
|
|
outs() << format(Fmt.data(), Address) << " " << RelocName << " "
|
|
|
|
<< ValueStr << "\n";
|
2018-06-07 21:30:55 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2019-01-28 22:11:35 +08:00
|
|
|
// Returns true if we need to show LMA column when dumping section headers. We
|
|
|
|
// show it only when the platform is ELF and either we have at least one section
|
|
|
|
// whose VMA and LMA are different and/or when --show-lma flag is used.
|
|
|
|
static bool shouldDisplayLMA(const ObjectFile *Obj) {
|
|
|
|
if (!Obj->isELF())
|
|
|
|
return false;
|
|
|
|
for (const SectionRef &S : ToolSectionFilter(*Obj))
|
|
|
|
if (S.getAddress() != getELFSectionLMA(S))
|
|
|
|
return true;
|
|
|
|
return ShowLMA;
|
|
|
|
}
|
|
|
|
|
2019-10-15 01:47:17 +08:00
|
|
|
static size_t getMaxSectionNameWidth(const ObjectFile *Obj) {
|
|
|
|
// Default column width for names is 13 even if no names are that long.
|
|
|
|
size_t MaxWidth = 13;
|
|
|
|
for (const SectionRef &Section : ToolSectionFilter(*Obj)) {
|
|
|
|
StringRef Name = unwrapOrError(Section.getName(), Obj->getFileName());
|
|
|
|
MaxWidth = std::max(MaxWidth, Name.size());
|
|
|
|
}
|
|
|
|
return MaxWidth;
|
|
|
|
}
|
|
|
|
|
2020-05-31 09:00:14 +08:00
|
|
|
void objdump::printSectionHeaders(const ObjectFile *Obj) {
|
2019-10-15 01:47:17 +08:00
|
|
|
size_t NameWidth = getMaxSectionNameWidth(Obj);
|
|
|
|
size_t AddressWidth = 2 * Obj->getBytesInAddress();
|
2019-01-28 22:11:35 +08:00
|
|
|
bool HasLMAColumn = shouldDisplayLMA(Obj);
|
|
|
|
if (HasLMAColumn)
|
|
|
|
outs() << "Sections:\n"
|
2019-10-15 01:47:17 +08:00
|
|
|
"Idx "
|
|
|
|
<< left_justify("Name", NameWidth) << " Size "
|
|
|
|
<< left_justify("VMA", AddressWidth) << " "
|
|
|
|
<< left_justify("LMA", AddressWidth) << " Type\n";
|
2019-01-28 22:11:35 +08:00
|
|
|
else
|
|
|
|
outs() << "Sections:\n"
|
2019-10-15 01:47:17 +08:00
|
|
|
"Idx "
|
|
|
|
<< left_justify("Name", NameWidth) << " Size "
|
|
|
|
<< left_justify("VMA", AddressWidth) << " Type\n";
|
2019-01-28 22:11:35 +08:00
|
|
|
|
Reland [llvm-objdump] Use a counter for llvm-objdump -h instead of the section index.
This relands r374931 (reverted in r375088). It fixes 32-bit builds by using the right format string specifier for uint64_t (PRIu64) instead of `%d`.
Original description:
When listing the index in `llvm-objdump -h`, use a zero-based counter instead of the actual section index (e.g. shdr->sh_index for ELF).
While this is effectively a noop for now (except one unit test for XCOFF), the index values will change in a future patch that filters certain sections out (e.g. symbol tables). See D68669 for more context. Note: the test case in `test/tools/llvm-objdump/X86/section-index.s` already covers the case of incrementing the section index counter when sections are skipped.
Reviewers: grimar, jhenderson, espindola
Reviewed By: grimar
Subscribers: emaste, sbc100, arichardson, aheejin, arphaman, seiya, llvm-commits, MaskRay
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68848
llvm-svn: 375178
2019-10-18 05:55:43 +08:00
|
|
|
uint64_t Idx;
|
|
|
|
for (const SectionRef &Section : ToolSectionFilter(*Obj, &Idx)) {
|
2019-08-14 19:10:11 +08:00
|
|
|
StringRef Name = unwrapOrError(Section.getName(), Obj->getFileName());
|
2019-01-28 22:11:35 +08:00
|
|
|
uint64_t VMA = Section.getAddress();
|
2019-01-29 00:36:12 +08:00
|
|
|
if (shouldAdjustVA(Section))
|
|
|
|
VMA += AdjustVMA;
|
|
|
|
|
2014-10-08 23:28:58 +08:00
|
|
|
uint64_t Size = Section.getSize();
|
2019-10-15 01:47:17 +08:00
|
|
|
|
|
|
|
std::string Type = Section.isText() ? "TEXT" : "";
|
|
|
|
if (Section.isData())
|
|
|
|
Type += Type.empty() ? "DATA" : " DATA";
|
|
|
|
if (Section.isBSS())
|
|
|
|
Type += Type.empty() ? "BSS" : " BSS";
|
2019-01-28 22:11:35 +08:00
|
|
|
|
|
|
|
if (HasLMAColumn)
|
2019-10-18 05:55:43 +08:00
|
|
|
outs() << format("%3" PRIu64 " %-*s %08" PRIx64 " ", Idx, NameWidth,
|
|
|
|
Name.str().c_str(), Size)
|
2019-10-15 01:47:17 +08:00
|
|
|
<< format_hex_no_prefix(VMA, AddressWidth) << " "
|
|
|
|
<< format_hex_no_prefix(getELFSectionLMA(Section), AddressWidth)
|
|
|
|
<< " " << Type << "\n";
|
2019-01-28 22:11:35 +08:00
|
|
|
else
|
2019-10-18 05:55:43 +08:00
|
|
|
outs() << format("%3" PRIu64 " %-*s %08" PRIx64 " ", Idx, NameWidth,
|
|
|
|
Name.str().c_str(), Size)
|
2019-10-15 01:47:17 +08:00
|
|
|
<< format_hex_no_prefix(VMA, AddressWidth) << " " << Type << "\n";
|
2011-10-11 05:21:34 +08:00
|
|
|
}
|
2018-11-17 16:12:48 +08:00
|
|
|
outs() << "\n";
|
2011-10-11 05:21:34 +08:00
|
|
|
}
|
|
|
|
|
2020-05-31 09:00:14 +08:00
|
|
|
void objdump::printSectionContents(const ObjectFile *Obj) {
|
2020-09-04 08:07:59 +08:00
|
|
|
const MachOObjectFile *MachO = dyn_cast<const MachOObjectFile>(Obj);
|
|
|
|
|
2015-07-29 23:45:39 +08:00
|
|
|
for (const SectionRef &Section : ToolSectionFilter(*Obj)) {
|
2019-08-14 19:10:11 +08:00
|
|
|
StringRef Name = unwrapOrError(Section.getName(), Obj->getFileName());
|
2014-10-08 23:28:58 +08:00
|
|
|
uint64_t BaseAddr = Section.getAddress();
|
2014-11-11 17:58:25 +08:00
|
|
|
uint64_t Size = Section.getSize();
|
|
|
|
if (!Size)
|
|
|
|
continue;
|
2011-10-18 01:13:22 +08:00
|
|
|
|
2020-09-04 08:07:59 +08:00
|
|
|
outs() << "Contents of section ";
|
|
|
|
StringRef SegmentName = getSegmentName(MachO, Section);
|
|
|
|
if (!SegmentName.empty())
|
|
|
|
outs() << SegmentName << ",";
|
|
|
|
outs() << Name << ":\n";
|
2014-11-11 17:58:25 +08:00
|
|
|
if (Section.isBSS()) {
|
2013-04-16 18:53:11 +08:00
|
|
|
outs() << format("<skipping contents of bss section at [%04" PRIx64
|
2014-07-15 00:20:14 +08:00
|
|
|
", %04" PRIx64 ")>\n",
|
|
|
|
BaseAddr, BaseAddr + Size);
|
2013-04-16 18:53:11 +08:00
|
|
|
continue;
|
|
|
|
}
|
2011-10-18 01:13:22 +08:00
|
|
|
|
2019-05-16 21:24:04 +08:00
|
|
|
StringRef Contents = unwrapOrError(Section.getContents(), Obj->getFileName());
|
2014-07-15 00:20:14 +08:00
|
|
|
|
2011-10-18 01:13:22 +08:00
|
|
|
// Dump out the content as hex and printable ascii characters.
|
2019-01-15 17:19:18 +08:00
|
|
|
for (std::size_t Addr = 0, End = Contents.size(); Addr < End; Addr += 16) {
|
|
|
|
outs() << format(" %04" PRIx64 " ", BaseAddr + Addr);
|
2011-10-18 01:13:22 +08:00
|
|
|
// Dump line of hex.
|
2019-01-15 17:19:18 +08:00
|
|
|
for (std::size_t I = 0; I < 16; ++I) {
|
|
|
|
if (I != 0 && I % 4 == 0)
|
2011-10-18 01:13:22 +08:00
|
|
|
outs() << ' ';
|
2019-01-15 17:19:18 +08:00
|
|
|
if (Addr + I < End)
|
|
|
|
outs() << hexdigit((Contents[Addr + I] >> 4) & 0xF, true)
|
|
|
|
<< hexdigit(Contents[Addr + I] & 0xF, true);
|
2011-10-18 01:13:22 +08:00
|
|
|
else
|
|
|
|
outs() << " ";
|
|
|
|
}
|
|
|
|
// Print ascii.
|
|
|
|
outs() << " ";
|
2019-01-15 17:19:18 +08:00
|
|
|
for (std::size_t I = 0; I < 16 && Addr + I < End; ++I) {
|
|
|
|
if (isPrint(static_cast<unsigned char>(Contents[Addr + I]) & 0xFF))
|
|
|
|
outs() << Contents[Addr + I];
|
2011-10-18 01:13:22 +08:00
|
|
|
else
|
|
|
|
outs() << ".";
|
|
|
|
}
|
|
|
|
outs() << "\n";
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
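The row layout produced by `printSectionContents` (address, sixteen bytes in groups of four, then printable ASCII) can be reproduced without LLVM. The sketch below is an approximation: `hexDump` is a hypothetical name, it returns a string instead of writing to `outs()`, and the two-space padding for missing bytes is an assumption made so that short final rows stay aligned.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical standalone version of the hex/ASCII dump loop.
std::string hexDump(const std::string &Contents, uint64_t BaseAddr) {
  static const char Digits[] = "0123456789abcdef";
  std::string Out;
  const std::size_t End = Contents.size();
  for (std::size_t Addr = 0; Addr < End; Addr += 16) {
    char AddrBuf[32];
    std::snprintf(AddrBuf, sizeof(AddrBuf), " %04llx ",
                  static_cast<unsigned long long>(BaseAddr + Addr));
    Out += AddrBuf;
    // Hex column: 16 bytes per row, a space between groups of 4.
    for (std::size_t I = 0; I < 16; ++I) {
      if (I != 0 && I % 4 == 0)
        Out += ' ';
      if (Addr + I < End) {
        unsigned char Byte = static_cast<unsigned char>(Contents[Addr + I]);
        Out += Digits[Byte >> 4];
        Out += Digits[Byte & 0xF];
      } else {
        Out += "  "; // assumed padding: two spaces per missing byte
      }
    }
    // ASCII column: printable bytes as-is, everything else as '.'.
    Out += ' ';
    for (std::size_t I = 0; I < 16 && Addr + I < End; ++I) {
      unsigned char Byte = static_cast<unsigned char>(Contents[Addr + I]);
      Out += std::isprint(Byte) ? static_cast<char>(Byte) : '.';
    }
    Out += '\n';
  }
  return Out;
}
```

Converting each byte to `unsigned char` before calling `std::isprint` mirrors the cast in the original loop and avoids passing a negative value for bytes above 0x7f.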
|
|
|
|
|
2020-05-31 09:00:14 +08:00
|
|
|
void objdump::printSymbolTable(const ObjectFile *O, StringRef ArchiveName,
|
|
|
|
StringRef ArchitectureName, bool DumpDynamic) {
|
2020-04-05 09:58:53 +08:00
|
|
|
if (O->isCOFF() && !DumpDynamic) {
|
|
|
|
outs() << "SYMBOL TABLE:\n";
|
|
|
|
printCOFFSymbolTable(cast<const COFFObjectFile>(O));
|
2014-03-19 02:58:51 +08:00
|
|
|
return;
|
|
|
|
}
|
2019-01-11 00:24:10 +08:00
|
|
|
|
2019-04-07 16:19:55 +08:00
|
|
|
const StringRef FileName = O->getFileName();
|
llvm-objdump should ignore Mach-O stab symbols for disassembly.
Summary:
llvm-objdump will commonly error out when disassembling a Mach-O binary with
stab symbols, or when printing a Mach-O symbol table that includesstab symbols.
That is because the Mach-O N_OSO symbol has been modified to include the
bottom 8-bit value of the Mach-O's cpusubtype value in the section field. In
general, one cannot blindly assume a stab symbol's section field is valid
unless one has actually consulted the specification for the specific stab.
Since objdump mostly just walks the symbol table to get mnemonics for code
disassembly it's best for objdump to just ignore stab symbols. llvm-nm will
do a more complete and correct job of displaying Mach-O symbol table contents.
Reviewers: pete, lhames, ab, thegameg, jhenderson, MaskRay
Reviewed By: thegameg, MaskRay
Subscribers: MaskRay, rupprecht, seiya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71394
2019-12-12 14:43:46 +08:00
|
|
|
|
2020-04-05 09:58:53 +08:00
|
|
|
if (!DumpDynamic) {
|
|
|
|
outs() << "SYMBOL TABLE:\n";
|
|
|
|
for (auto I = O->symbol_begin(); I != O->symbol_end(); ++I)
|
|
|
|
printSymbol(O, *I, FileName, ArchiveName, ArchitectureName, DumpDynamic);
|
|
|
|
return;
|
|
|
|
}
|
2019-12-12 14:43:46 +08:00
|
|
|
|
2020-04-05 09:58:53 +08:00
|
|
|
outs() << "DYNAMIC SYMBOL TABLE:\n";
|
|
|
|
if (!O->isELF()) {
|
|
|
|
reportWarning(
|
|
|
|
"this operation is not currently supported for this file format",
|
|
|
|
FileName);
|
|
|
|
return;
|
|
|
|
}
|
2019-08-14 19:10:11 +08:00
|
|
|
|
2020-04-05 09:58:53 +08:00
|
|
|
const ELFObjectFileBase *ELF = cast<const ELFObjectFileBase>(O);
|
|
|
|
for (auto I = ELF->getDynamicSymbolIterators().begin();
|
|
|
|
I != ELF->getDynamicSymbolIterators().end(); ++I)
|
|
|
|
printSymbol(O, *I, FileName, ArchiveName, ArchitectureName, DumpDynamic);
|
|
|
|
}
|
2011-10-19 03:32:17 +08:00
|
|
|
|
2020-05-31 09:00:14 +08:00
|
|
|
void objdump::printSymbol(const ObjectFile *O, const SymbolRef &Symbol,
|
|
|
|
StringRef FileName, StringRef ArchiveName,
|
|
|
|
StringRef ArchitectureName, bool DumpDynamic) {
|
2020-04-05 09:58:53 +08:00
|
|
|
const MachOObjectFile *MachO = dyn_cast<const MachOObjectFile>(O);
|
|
|
|
uint64_t Address = unwrapOrError(Symbol.getAddress(), FileName, ArchiveName,
|
|
|
|
ArchitectureName);
|
|
|
|
if ((Address < StartAddress) || (Address > StopAddress))
|
|
|
|
return;
|
|
|
|
SymbolRef::Type Type =
|
|
|
|
unwrapOrError(Symbol.getType(), FileName, ArchiveName, ArchitectureName);
|
2020-04-10 20:24:21 +08:00
|
|
|
uint32_t Flags =
|
|
|
|
unwrapOrError(Symbol.getFlags(), FileName, ArchiveName, ArchitectureName);
|
2020-04-05 09:58:53 +08:00
|
|
|
|
|
|
|
// Don't ask a Mach-O STAB symbol for its section unless you know that
|
|
|
|
// STAB symbol's section field refers to a valid section index. Otherwise
|
|
|
|
// the symbol may error trying to load a section that does not exist.
|
|
|
|
bool IsSTAB = false;
|
|
|
|
if (MachO) {
|
|
|
|
DataRefImpl SymDRI = Symbol.getRawDataRefImpl();
|
|
|
|
uint8_t NType =
|
|
|
|
(MachO->is64Bit() ? MachO->getSymbol64TableEntry(SymDRI).n_type
|
|
|
|
: MachO->getSymbolTableEntry(SymDRI).n_type);
|
|
|
|
if (NType & MachO::N_STAB)
|
|
|
|
IsSTAB = true;
|
|
|
|
}
|
|
|
|
section_iterator Section = IsSTAB
|
|
|
|
? O->section_end()
|
|
|
|
: unwrapOrError(Symbol.getSection(), FileName,
|
|
|
|
ArchiveName, ArchitectureName);
|
|
|
|
|
|
|
|
StringRef Name;
|
|
|
|
if (Type == SymbolRef::ST_Debug && Section != O->section_end()) {
|
|
|
|
if (Expected<StringRef> NameOrErr = Section->getName())
|
|
|
|
Name = *NameOrErr;
|
|
|
|
else
|
|
|
|
consumeError(NameOrErr.takeError());
|
2015-06-23 23:45:38 +08:00
|
|
|
|
2020-04-05 09:58:53 +08:00
|
|
|
} else {
|
|
|
|
Name = unwrapOrError(Symbol.getName(), FileName, ArchiveName,
|
|
|
|
ArchitectureName);
|
|
|
|
}
|
|
|
|
|
|
|
|
bool Global = Flags & SymbolRef::SF_Global;
|
|
|
|
bool Weak = Flags & SymbolRef::SF_Weak;
|
|
|
|
bool Absolute = Flags & SymbolRef::SF_Absolute;
|
|
|
|
bool Common = Flags & SymbolRef::SF_Common;
|
|
|
|
bool Hidden = Flags & SymbolRef::SF_Hidden;
|
|
|
|
|
|
|
|
char GlobLoc = ' ';
|
|
|
|
if ((Section != O->section_end() || Absolute) && !Weak)
|
|
|
|
GlobLoc = Global ? 'g' : 'l';
|
|
|
|
char IFunc = ' ';
|
2020-04-05 12:31:22 +08:00
|
|
|
if (O->isELF()) {
|
2020-04-05 09:58:53 +08:00
|
|
|
if (ELFSymbolRef(Symbol).getELFType() == ELF::STT_GNU_IFUNC)
|
|
|
|
IFunc = 'i';
|
|
|
|
if (ELFSymbolRef(Symbol).getBinding() == ELF::STB_GNU_UNIQUE)
|
|
|
|
GlobLoc = 'u';
|
|
|
|
}
|
|
|
|
|
|
|
|
char Debug = ' ';
|
|
|
|
if (DumpDynamic)
|
|
|
|
Debug = 'D';
|
|
|
|
else if (Type == SymbolRef::ST_Debug || Type == SymbolRef::ST_File)
|
|
|
|
Debug = 'd';
|
|
|
|
|
|
|
|
char FileFunc = ' ';
|
|
|
|
if (Type == SymbolRef::ST_File)
|
|
|
|
FileFunc = 'f';
|
|
|
|
else if (Type == SymbolRef::ST_Function)
|
|
|
|
FileFunc = 'F';
|
|
|
|
else if (Type == SymbolRef::ST_Data)
|
|
|
|
FileFunc = 'O';
|
|
|
|
|
|
|
|
const char *Fmt = O->getBytesInAddress() > 4 ? "%016" PRIx64 : "%08" PRIx64;
|
|
|
|
|
|
|
|
outs() << format(Fmt, Address) << " "
|
|
|
|
<< GlobLoc // Local -> 'l', Global -> 'g', Neither -> ' '
|
|
|
|
<< (Weak ? 'w' : ' ') // Weak?
|
|
|
|
<< ' ' // Constructor. Not supported yet.
|
|
|
|
<< ' ' // Warning. Not supported yet.
|
|
|
|
<< IFunc // Indirect reference to another symbol.
|
|
|
|
<< Debug // Debugging (d) or dynamic (D) symbol.
|
|
|
|
<< FileFunc // Name of function (F), file (f) or object (O).
|
|
|
|
<< ' ';
|
|
|
|
if (Absolute) {
|
|
|
|
outs() << "*ABS*";
|
|
|
|
} else if (Common) {
|
|
|
|
outs() << "*COM*";
|
|
|
|
} else if (Section == O->section_end()) {
|
|
|
|
outs() << "*UND*";
|
|
|
|
} else {
|
2020-09-04 08:07:59 +08:00
|
|
|
StringRef SegmentName = getSegmentName(MachO, *Section);
|
|
|
|
if (!SegmentName.empty())
|
2020-04-05 09:58:53 +08:00
|
|
|
outs() << SegmentName << ",";
|
|
|
|
StringRef SectionName = unwrapOrError(Section->getName(), FileName);
|
|
|
|
outs() << SectionName;
|
|
|
|
}
|
2015-06-23 23:45:38 +08:00
|
|
|
|
2020-04-05 12:31:22 +08:00
|
|
|
if (Common || O->isELF()) {
|
2020-04-05 09:58:53 +08:00
|
|
|
uint64_t Val =
|
|
|
|
Common ? Symbol.getAlignment() : ELFSymbolRef(Symbol).getSize();
|
|
|
|
outs() << '\t' << format(Fmt, Val);
|
|
|
|
}
|
|
|
|
|
2020-04-05 12:31:22 +08:00
|
|
|
if (O->isELF()) {
|
2020-04-05 09:58:53 +08:00
|
|
|
uint8_t Other = ELFSymbolRef(Symbol).getOther();
|
|
|
|
switch (Other) {
|
|
|
|
case ELF::STV_DEFAULT:
|
|
|
|
break;
|
|
|
|
case ELF::STV_INTERNAL:
|
|
|
|
outs() << " .internal";
|
|
|
|
break;
|
|
|
|
case ELF::STV_HIDDEN:
|
2019-05-11 00:24:57 +08:00
|
|
|
outs() << " .hidden";
|
2020-04-05 09:58:53 +08:00
|
|
|
break;
|
|
|
|
case ELF::STV_PROTECTED:
|
|
|
|
outs() << " .protected";
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
outs() << format(" 0x%02x", Other);
|
|
|
|
break;
|
2019-05-11 00:24:57 +08:00
|
|
|
}
|
2020-04-05 09:58:53 +08:00
|
|
|
} else if (Hidden) {
|
|
|
|
outs() << " .hidden";
|
2011-10-19 03:32:17 +08:00
|
|
|
}
|
2020-04-05 09:58:53 +08:00
|
|
|
|
|
|
|
if (Demangle)
|
|
|
|
outs() << ' ' << demangle(std::string(Name)) << '\n';
|
|
|
|
else
|
|
|
|
outs() << ' ' << Name << '\n';
|
2011-10-19 03:32:17 +08:00
|
|
|
}
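The seven flag characters that `printSymbol` emits after the address can be modeled as a pure function of a few symbol properties. The following is a sketch with hypothetical types (`SymKind`, `SymbolFlags`, `flagColumn`); in the real function these properties come from `SymbolRef` and `ELFSymbolRef`, and the `IFunc` and `GnuUnique` cases apply to ELF objects only.

```cpp
#include <string>

// Hypothetical stand-ins for the properties queried in printSymbol.
enum class SymKind { Unknown, Debug, File, Function, Data };

struct SymbolFlags {
  bool HasSection = false; // symbol refers to a real section
  bool Absolute = false;
  bool Global = false;
  bool Weak = false;
  bool IFunc = false;     // ELF STT_GNU_IFUNC
  bool GnuUnique = false; // ELF STB_GNU_UNIQUE
  bool Dynamic = false;   // dumping the dynamic symbol table
  SymKind Kind = SymKind::Unknown;
};

// Build the 7-character flag column in the same order as printSymbol.
std::string flagColumn(const SymbolFlags &S) {
  char GlobLoc = ' ';
  if ((S.HasSection || S.Absolute) && !S.Weak)
    GlobLoc = S.Global ? 'g' : 'l';
  if (S.GnuUnique)
    GlobLoc = 'u';

  char Debug = ' ';
  if (S.Dynamic)
    Debug = 'D';
  else if (S.Kind == SymKind::Debug || S.Kind == SymKind::File)
    Debug = 'd';

  char FileFunc = ' ';
  if (S.Kind == SymKind::File)
    FileFunc = 'f';
  else if (S.Kind == SymKind::Function)
    FileFunc = 'F';
  else if (S.Kind == SymKind::Data)
    FileFunc = 'O';

  std::string Out;
  Out += GlobLoc;            // local, global, unique, or neither
  Out += S.Weak ? 'w' : ' '; // weak
  Out += ' ';                // constructor (not supported)
  Out += ' ';                // warning (not supported)
  Out += S.IFunc ? 'i' : ' ';
  Out += Debug;
  Out += FileFunc;
  return Out;
}
```

Keeping the column a fixed seven characters, with blanks for unsupported flags, is what lets the output line up with GNU objdump's symbol table format.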
|
|
|
|
|
2019-01-15 17:19:18 +08:00
|
|
|
static void printUnwindInfo(const ObjectFile *O) {
|
2012-12-06 04:12:35 +08:00
|
|
|
outs() << "Unwind info:\n\n";
|
|
|
|
|
2019-01-15 17:19:18 +08:00
|
|
|
if (const COFFObjectFile *Coff = dyn_cast<COFFObjectFile>(O))
|
|
|
|
printCOFFUnwindInfo(Coff);
|
|
|
|
else if (const MachOObjectFile *MachO = dyn_cast<MachOObjectFile>(O))
|
2014-08-01 21:07:19 +08:00
|
|
|
printMachOUnwindInfo(MachO);
|
2019-01-15 17:19:18 +08:00
|
|
|
else
|
2012-12-06 04:12:35 +08:00
|
|
|
// TODO: Extract DWARF dump tool to objdump.
|
2018-11-12 06:12:04 +08:00
|
|
|
WithColor::error(errs(), ToolName)
|
|
|
|
<< "This operation is only currently supported "
|
|
|
|
"for COFF and MachO object files.\n";
|
2012-12-06 04:12:35 +08:00
|
|
|
}
|
|
|
|
|
2015-07-08 10:04:15 +08:00
|
|
|
/// Dump the raw contents of the __clangast section so the output can be piped
|
|
|
|
/// into llvm-bcanalyzer.
|
2020-05-31 09:00:14 +08:00
|
|
|
static void printRawClangAST(const ObjectFile *Obj) {
|
2015-07-08 10:04:15 +08:00
|
|
|
if (outs().is_displayed()) {
|
2018-11-12 06:12:04 +08:00
|
|
|
WithColor::error(errs(), ToolName)
|
|
|
|
<< "The -raw-clang-ast option will dump the raw binary contents of "
|
|
|
|
"the clang ast section.\n"
|
|
|
|
"Please redirect the output to a file or another program such as "
|
|
|
|
"llvm-bcanalyzer.\n";
|
2015-07-08 10:04:15 +08:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
StringRef ClangASTSectionName("__clangast");
|
2020-04-05 12:31:22 +08:00
|
|
|
if (Obj->isCOFF()) {
|
2015-07-08 10:04:15 +08:00
|
|
|
ClangASTSectionName = "clangast";
|
|
|
|
}
|
|
|
|
|
|
|
|
Optional<object::SectionRef> ClangASTSection;
|
2015-07-29 23:45:39 +08:00
|
|
|
for (auto Sec : ToolSectionFilter(*Obj)) {
|
2015-07-08 10:04:15 +08:00
|
|
|
StringRef Name;
|
2019-08-14 19:10:11 +08:00
|
|
|
if (Expected<StringRef> NameOrErr = Sec.getName())
|
|
|
|
Name = *NameOrErr;
|
|
|
|
else
|
|
|
|
consumeError(NameOrErr.takeError());
|
|
|
|
|
2015-07-08 10:04:15 +08:00
|
|
|
if (Name == ClangASTSectionName) {
|
|
|
|
ClangASTSection = Sec;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if (!ClangASTSection)
|
|
|
|
return;
|
|
|
|
|
2019-05-16 21:24:04 +08:00
|
|
|
StringRef ClangASTContents = unwrapOrError(
|
|
|
|
ClangASTSection.getValue().getContents(), Obj->getFileName());
|
2015-07-08 10:04:15 +08:00
|
|
|
outs().write(ClangASTContents.data(), ClangASTContents.size());
|
|
|
|
}
|
|
|
|
|
2015-06-23 02:03:02 +08:00
|
|
|
static void printFaultMaps(const ObjectFile *Obj) {
|
2019-01-15 17:19:18 +08:00
|
|
|
StringRef FaultMapSectionName;
|
2015-06-23 02:03:02 +08:00
|
|
|
|
2020-04-05 12:31:22 +08:00
|
|
|
if (Obj->isELF()) {
|
2015-06-23 02:03:02 +08:00
|
|
|
FaultMapSectionName = ".llvm_faultmaps";
|
2020-04-05 12:31:22 +08:00
|
|
|
} else if (Obj->isMachO()) {
|
2015-06-23 02:03:02 +08:00
|
|
|
FaultMapSectionName = "__llvm_faultmaps";
|
|
|
|
} else {
|
2018-11-12 06:12:04 +08:00
|
|
|
WithColor::error(errs(), ToolName)
|
|
|
|
<< "This operation is only currently supported "
|
|
|
|
"for ELF and Mach-O executable files.\n";
|
2015-06-23 02:03:02 +08:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
Optional<object::SectionRef> FaultMapSection;
|
|
|
|
|
2015-07-29 23:45:39 +08:00
|
|
|
for (auto Sec : ToolSectionFilter(*Obj)) {
|
2015-06-23 02:03:02 +08:00
|
|
|
StringRef Name;
|
2019-08-14 19:10:11 +08:00
|
|
|
if (Expected<StringRef> NameOrErr = Sec.getName())
|
|
|
|
Name = *NameOrErr;
|
|
|
|
else
|
|
|
|
consumeError(NameOrErr.takeError());
|
|
|
|
|
2015-06-23 02:03:02 +08:00
|
|
|
if (Name == FaultMapSectionName) {
|
|
|
|
FaultMapSection = Sec;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
outs() << "FaultMap table:\n";
|
|
|
|
|
|
|
|
if (!FaultMapSection.hasValue()) {
|
|
|
|
outs() << "<not found>\n";
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
2019-05-16 21:24:04 +08:00
|
|
|
StringRef FaultMapContents =
|
|
|
|
unwrapOrError(FaultMapSection.getValue().getContents(), Obj->getFileName());
|
2015-06-23 02:03:02 +08:00
|
|
|
FaultMapParser FMP(FaultMapContents.bytes_begin(),
|
|
|
|
FaultMapContents.bytes_end());
|
|
|
|
|
|
|
|
outs() << FMP;
|
|
|
|
}
|
|
|
|
|
2019-01-15 17:19:18 +08:00
|
|
|
static void printPrivateFileHeaders(const ObjectFile *O, bool OnlyFirst) {
|
|
|
|
if (O->isELF()) {
|
|
|
|
printELFFileHeader(O);
|
2019-02-25 21:13:19 +08:00
|
|
|
printELFDynamicSection(O);
|
|
|
|
printELFSymbolVersionInfo(O);
|
|
|
|
return;
|
2018-07-25 19:09:20 +08:00
|
|
|
}
|
2019-01-15 17:19:18 +08:00
|
|
|
if (O->isCOFF())
|
|
|
|
return printCOFFFileHeader(O);
|
|
|
|
if (O->isWasm())
|
|
|
|
return printWasmFileHeader(O);
|
|
|
|
if (O->isMachO()) {
|
|
|
|
printMachOFileHeader(O);
|
|
|
|
if (!OnlyFirst)
|
|
|
|
printMachOLoadCommands(O);
|
2016-09-18 12:39:15 +08:00
|
|
|
return;
|
|
|
|
}
|
2019-08-21 19:07:31 +08:00
|
|
|
reportError(O->getFileName(), "Invalid/Unsupported object file format");
|
2013-09-28 05:04:00 +08:00
|
|
|
}
|
|
|
|
|
2019-01-15 17:19:18 +08:00
|
|
|
static void printFileHeaders(const ObjectFile *O) {
|
|
|
|
if (!O->isELF() && !O->isCOFF())
|
2019-08-21 19:07:31 +08:00
|
|
|
reportError(O->getFileName(), "Invalid/Unsupported object file format");
|
2018-07-04 23:25:03 +08:00
|
|
|
|
2019-01-15 17:19:18 +08:00
|
|
|
Triple::ArchType AT = O->getArch();
|
2018-07-04 23:25:03 +08:00
|
|
|
outs() << "architecture: " << Triple::getArchTypeName(AT) << "\n";
|
2019-04-07 16:19:55 +08:00
|
|
|
uint64_t Address = unwrapOrError(O->getStartAddress(), O->getFileName());
|
2018-10-20 06:16:49 +08:00
|
|
|
|
2019-01-15 17:19:18 +08:00
|
|
|
StringRef Fmt = O->getBytesInAddress() > 4 ? "%016" PRIx64 : "%08" PRIx64;
|
2018-07-04 23:25:03 +08:00
|
|
|
outs() << "start address: "
|
2019-01-12 20:17:24 +08:00
|
|
|
<< "0x" << format(Fmt.data(), Address) << "\n\n";
|
2018-07-04 23:25:03 +08:00
|
|
|
}
|
|
|
|
|
2018-07-05 22:43:29 +08:00
|
|
|
static void printArchiveChild(StringRef Filename, const Archive::Child &C) {
|
|
|
|
Expected<sys::fs::perms> ModeOrErr = C.getAccessMode();
|
|
|
|
if (!ModeOrErr) {
|
2018-11-12 06:12:04 +08:00
|
|
|
WithColor::error(errs(), ToolName) << "ill-formed archive entry.\n";
|
2018-07-05 22:43:29 +08:00
|
|
|
consumeError(ModeOrErr.takeError());
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
sys::fs::perms Mode = ModeOrErr.get();
|
|
|
|
outs() << ((Mode & sys::fs::owner_read) ? "r" : "-");
|
|
|
|
outs() << ((Mode & sys::fs::owner_write) ? "w" : "-");
|
|
|
|
outs() << ((Mode & sys::fs::owner_exe) ? "x" : "-");
|
|
|
|
outs() << ((Mode & sys::fs::group_read) ? "r" : "-");
|
|
|
|
outs() << ((Mode & sys::fs::group_write) ? "w" : "-");
|
|
|
|
outs() << ((Mode & sys::fs::group_exe) ? "x" : "-");
|
|
|
|
outs() << ((Mode & sys::fs::others_read) ? "r" : "-");
|
|
|
|
outs() << ((Mode & sys::fs::others_write) ? "w" : "-");
|
|
|
|
outs() << ((Mode & sys::fs::others_exe) ? "x" : "-");
|
|
|
|
|
|
|
|
outs() << " ";
|
|
|
|
|
2019-04-07 16:19:55 +08:00
|
|
|
outs() << format("%d/%d %6" PRId64 " ", unwrapOrError(C.getUID(), Filename),
|
|
|
|
unwrapOrError(C.getGID(), Filename),
|
|
|
|
unwrapOrError(C.getRawSize(), Filename));
|
2018-07-05 22:43:29 +08:00
|
|
|
|
|
|
|
StringRef RawLastModified = C.getRawLastModified();
|
|
|
|
unsigned Seconds;
|
|
|
|
if (RawLastModified.getAsInteger(10, Seconds))
|
|
|
|
outs() << "(date: \"" << RawLastModified
|
|
|
|
<< "\" contains non-decimal chars) ";
|
|
|
|
else {
|
|
|
|
// Since ctime(3) returns a 26 character string of the form:
|
|
|
|
// "Sun Sep 16 01:03:52 1973\n\0"
|
|
|
|
// just print 24 characters.
|
|
|
|
time_t t = Seconds;
|
|
|
|
outs() << format("%.24s ", ctime(&t));
|
|
|
|
}
|
|
|
|
|
|
|
|
StringRef Name = "";
|
|
|
|
Expected<StringRef> NameOrErr = C.getName();
|
|
|
|
if (!NameOrErr) {
|
|
|
|
consumeError(NameOrErr.takeError());
|
2019-04-07 16:19:55 +08:00
|
|
|
Name = unwrapOrError(C.getRawName(), Filename);
|
2018-07-05 22:43:29 +08:00
|
|
|
} else {
|
|
|
|
Name = NameOrErr.get();
|
|
|
|
}
|
|
|
|
outs() << Name << "\n";
|
|
|
|
}
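The nine permission characters printed by `printArchiveChild` follow the usual `ls -l` layout. A compact standalone sketch is shown below; `permString` is a hypothetical helper, and it assumes the mode uses the conventional POSIX octal bits (0400 for owner read down to 0001 for others execute) rather than taking an `llvm::sys::fs::perms` value.

```cpp
#include <string>

// Hypothetical helper: render POSIX-style permission bits as "rwxrwxrwx",
// with '-' for each bit that is not set.
std::string permString(unsigned Mode) {
  static const char Chars[] = "rwxrwxrwx";
  std::string Out(9, '-');
  for (int Bit = 0; Bit < 9; ++Bit)
    if (Mode & (0400u >> Bit)) // walk from owner-read down to others-exe
      Out[Bit] = Chars[Bit];
  return Out;
}
```

For example, `permString(0644)` yields `rw-r--r--` and `permString(0755)` yields `rwxr-xr-x`.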
|
|
|
|
|
2019-07-25 00:55:30 +08:00
|
|
|
// For ELF only now.
|
|
|
|
static bool shouldWarnForInvalidStartStopAddress(ObjectFile *Obj) {
|
|
|
|
if (const auto *Elf = dyn_cast<ELFObjectFileBase>(Obj)) {
|
|
|
|
if (Elf->getEType() != ELF::ET_REL)
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void checkForInvalidStartStopAddress(ObjectFile *Obj,
|
|
|
|
uint64_t Start, uint64_t Stop) {
|
|
|
|
if (!shouldWarnForInvalidStartStopAddress(Obj))
|
|
|
|
return;
|
|
|
|
|
|
|
|
for (const SectionRef &Section : Obj->sections())
|
|
|
|
if (ELFSectionRef(Section).getFlags() & ELF::SHF_ALLOC) {
|
|
|
|
uint64_t BaseAddr = Section.getAddress();
|
|
|
|
uint64_t Size = Section.getSize();
|
|
|
|
if ((Start < BaseAddr + Size) && Stop > BaseAddr)
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (StartAddress.getNumOccurrences() == 0)
|
2019-08-21 19:07:31 +08:00
|
|
|
reportWarning("no section has address less than 0x" +
|
|
|
|
Twine::utohexstr(Stop) + " specified by --stop-address",
|
|
|
|
Obj->getFileName());
|
2019-07-25 00:55:30 +08:00
|
|
|
else if (StopAddress.getNumOccurrences() == 0)
|
2019-08-21 19:07:31 +08:00
|
|
|
reportWarning("no section has address greater than or equal to 0x" +
|
|
|
|
Twine::utohexstr(Start) + " specified by --start-address",
|
|
|
|
Obj->getFileName());
|
2019-07-25 00:55:30 +08:00
|
|
|
else
|
2019-08-21 19:07:31 +08:00
|
|
|
reportWarning("no section overlaps the range [0x" +
|
|
|
|
Twine::utohexstr(Start) + ",0x" + Twine::utohexstr(Stop) +
|
|
|
|
") specified by --start-address/--stop-address",
|
|
|
|
Obj->getFileName());
|
2019-07-25 00:55:30 +08:00
|
|
|
}
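The test at the heart of `checkForInvalidStartStopAddress` is a half-open interval overlap check between the requested range and each allocated section. A minimal sketch, using a hypothetical `Range` struct in place of `SectionRef`:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a section's address and size.
struct Range {
  uint64_t Base;
  uint64_t Size;
};

// Returns true if [Start, Stop) overlaps [Base, Base + Size) for any section.
bool anySectionOverlaps(const std::vector<Range> &Sections, uint64_t Start,
                        uint64_t Stop) {
  for (const Range &S : Sections)
    if (Start < S.Base + S.Size && Stop > S.Base)
      return true;
  return false;
}
```

When this returns false, the original function goes on to choose one of three warnings depending on which of `--start-address` and `--stop-address` was actually given on the command line.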
|
|
|
|
|
2019-01-15 17:19:18 +08:00
|
|
|
static void dumpObject(ObjectFile *O, const Archive *A = nullptr,
|
|
|
|
const Archive::Child *C = nullptr) {
|
2015-07-08 10:04:15 +08:00
|
|
|
// Avoid other output when using a raw option.
|
|
|
|
if (!RawClangAST) {
|
|
|
|
outs() << '\n';
|
2019-01-15 17:19:18 +08:00
|
|
|
if (A)
|
|
|
|
outs() << A->getFileName() << "(" << O->getFileName() << ")";
|
2016-05-18 01:10:12 +08:00
|
|
|
else
|
2019-01-15 17:19:18 +08:00
|
|
|
outs() << O->getFileName();
|
[llvm-objdump] Print file format in lowercase to match GNU output.
Summary:
GNU objdump prints the file format in lowercase, e.g. `elf64-x86-64`. llvm-objdump prints `ELF64-x86-64` right now, even though piping that into llvm-objcopy refuses that as a valid arch to use.
As an example of a problem this causes, see: https://github.com/ClangBuiltLinux/linux/issues/779
Reviewers: MaskRay, jhenderson, alexshap
Reviewed By: MaskRay
Subscribers: tpimh, sbc100, grimar, jvesely, nhaehnle, kerbowa, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74433
2020-02-12 03:55:40 +08:00
|
|
|
outs() << ":\tfile format " << O->getFileFormatName().lower() << "\n\n";
|
2015-07-08 10:04:15 +08:00
|
|
|
}
|
2011-10-18 01:13:22 +08:00
|
|
|
|
2019-07-25 00:55:30 +08:00
|
|
|
if (StartAddress.getNumOccurrences() || StopAddress.getNumOccurrences())
|
|
|
|
checkForInvalidStartStopAddress(O, StartAddress, StopAddress);
|
|
|
|
|
[llvm-objdump] Further rearrange llvm-objdump sections for compatability
Summary:
rL371826 rearranged some output from llvm-objdump for GNU objdump compatability, but there still seem to be some more.
I think this rearrangement is a little closer. Overview of the ordering which matches GNU objdump:
* Archive headers
* File headers
* Section headers
* Symbol table
* Dwarf debugging
* Relocations (if `--disassemble` is not used)
* Section contents
* Disassembly
Reviewers: jhenderson, justice_adams, grimar, ychen, espindola
Reviewed By: jhenderson
Subscribers: aprantl, emaste, arichardson, jrtc27, atanasyan, seiya, llvm-commits, MaskRay
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68066
llvm-svn: 373671
2019-10-04 06:01:08 +08:00
|
|
|
// Note: the order here matches GNU objdump for compatibility.
|
2019-01-15 17:19:18 +08:00
|
|
|
StringRef ArchiveName = A ? A->getFileName() : "";
|
|
|
|
if (ArchiveHeaders && !MachOOpt && C)
|
|
|
|
printArchiveChild(ArchiveName, *C);
|
2019-10-04 06:01:08 +08:00
|
|
|
if (FileHeaders)
|
|
|
|
printFileHeaders(O);
|
2019-09-13 16:56:28 +08:00
|
|
|
if (PrivateHeaders || FirstPrivateHeader)
|
|
|
|
printPrivateFileHeaders(O, FirstPrivateHeader);
|
2011-10-11 05:21:34 +08:00
|
|
|
if (SectionHeaders)
|
2019-01-15 17:19:18 +08:00
|
|
|
printSectionHeaders(O);
|
2011-10-19 03:32:17 +08:00
|
|
|
if (SymbolTable)
|
2019-01-15 17:19:18 +08:00
|
|
|
printSymbolTable(O, ArchiveName);
|
2020-04-05 09:58:53 +08:00
|
|
|
if (DynamicSymbolTable)
|
|
|
|
printSymbolTable(O, ArchiveName, /*ArchitectureName=*/"",
|
|
|
|
/*DumpDynamic=*/true);
|
2019-10-04 06:01:08 +08:00
|
|
|
if (DwarfDumpType != DIDT_Null) {
|
|
|
|
std::unique_ptr<DIContext> DICtx = DWARFContext::create(*O);
|
|
|
|
// Dump the complete DWARF structure.
|
|
|
|
DIDumpOptions DumpOpts;
|
|
|
|
DumpOpts.DumpType = DwarfDumpType;
|
|
|
|
DICtx->dump(outs(), DumpOpts);
|
|
|
|
}
|
|
|
|
if (Relocations && !Disassemble)
|
|
|
|
printRelocations(O);
|
|
|
|
if (DynamicRelocations)
|
|
|
|
printDynamicRelocations(O);
|
|
|
|
if (SectionContents)
|
|
|
|
printSectionContents(O);
|
|
|
|
if (Disassemble)
|
|
|
|
disassembleObject(O, Relocations);
|
2012-12-06 04:12:35 +08:00
|
|
|
if (UnwindInfo)
|
2019-01-15 17:19:18 +08:00
|
|
|
printUnwindInfo(O);
|
2019-10-04 06:01:08 +08:00
|
|
|
|
|
|
|
// Mach-O specific options:
|
2014-08-30 08:20:14 +08:00
|
|
|
if (ExportsTrie)
|
2019-01-15 17:19:18 +08:00
|
|
|
printExportsTrie(O);
|
2014-09-13 05:34:15 +08:00
|
|
|
if (Rebase)
|
2019-01-15 17:19:18 +08:00
|
|
|
printRebaseTable(O);
|
2014-09-16 09:41:51 +08:00
|
|
|
if (Bind)
|
2019-01-15 17:19:18 +08:00
|
|
|
printBindTable(O);
|
2014-09-16 09:41:51 +08:00
|
|
|
if (LazyBind)
|
2019-01-15 17:19:18 +08:00
|
|
|
printLazyBindTable(O);
|
2014-09-16 09:41:51 +08:00
|
|
|
if (WeakBind)
|
2019-01-15 17:19:18 +08:00
|
|
|
printWeakBindTable(O);
|
2019-10-04 06:01:08 +08:00
|
|
|
|
|
|
|
// Other special sections:
|
2015-07-08 10:04:15 +08:00
|
|
|
if (RawClangAST)
|
2019-01-15 17:19:18 +08:00
|
|
|
printRawClangAST(O);
|
2019-04-15 23:00:10 +08:00
|
|
|
if (FaultMapSection)
|
2019-01-15 17:19:18 +08:00
|
|
|
printFaultMaps(O);
|
2011-10-08 08:18:30 +08:00
|
|
|
}
|
|
|
|
|
2019-01-15 17:19:18 +08:00
|
|
|
static void dumpObject(const COFFImportFile *I, const Archive *A,
|
2018-07-05 22:43:29 +08:00
|
|
|
const Archive::Child *C = nullptr) {
|
2016-08-19 00:39:19 +08:00
|
|
|
StringRef ArchiveName = A ? A->getFileName() : "";
|
|
|
|
|
|
|
|
// Avoid other output when using a raw option.
|
|
|
|
if (!RawClangAST)
|
|
|
|
outs() << '\n'
|
|
|
|
<< ArchiveName << "(" << I->getFileName() << ")"
|
|
|
|
<< ":\tfile format COFF-import-file"
|
|
|
|
<< "\n\n";
|
|
|
|
|
2018-10-29 22:17:08 +08:00
|
|
|
if (ArchiveHeaders && !MachOOpt && C)
|
|
|
|
printArchiveChild(ArchiveName, *C);
|
2016-08-19 00:39:19 +08:00
|
|
|
if (SymbolTable)
|
|
|
|
printCOFFSymbolTable(I);
|
|
|
|
}
|
|
|
|
|
2018-05-02 00:10:38 +08:00
|
|
|
/// Dump each object file in \a A.
|
2019-01-15 17:19:18 +08:00
|
|
|
static void dumpArchive(const Archive *A) {
|
2016-11-11 12:28:40 +08:00
|
|
|
Error Err = Error::success();
|
2019-08-20 21:19:16 +08:00
|
|
|
unsigned I = -1; // Wraps to UINT_MAX so that the first ++I yields index 0.
|
2019-01-15 17:19:18 +08:00
|
|
|
for (auto &C : A->children(Err)) {
|
2019-08-20 21:19:16 +08:00
|
|
|
++I;
|
2016-05-18 01:10:12 +08:00
|
|
|
Expected<std::unique_ptr<Binary>> ChildOrErr = C.getAsBinary();
|
|
|
|
if (!ChildOrErr) {
|
|
|
|
if (auto E = isNotObjectErrorInvalidFileType(ChildOrErr.takeError()))
|
2019-08-27 18:03:45 +08:00
|
|
|
reportError(std::move(E), getFileNameForError(C, I), A->getFileName());
|
2016-05-18 01:10:12 +08:00
|
|
|
continue;
|
|
|
|
}
|
2019-01-15 17:19:18 +08:00
|
|
|
if (ObjectFile *O = dyn_cast<ObjectFile>(&*ChildOrErr.get()))
|
|
|
|
dumpObject(O, A, &C);
|
2016-08-19 00:39:19 +08:00
|
|
|
else if (COFFImportFile *I = dyn_cast<COFFImportFile>(&*ChildOrErr.get()))
|
2019-01-15 17:19:18 +08:00
|
|
|
dumpObject(I, A, &C);
|
2011-10-08 08:18:30 +08:00
|
|
|
else
|
2019-08-21 19:07:31 +08:00
|
|
|
reportError(errorCodeToError(object_error::invalid_file_type),
|
|
|
|
A->getFileName());
|
2011-10-08 08:18:30 +08:00
|
|
|
}
|
2016-07-14 10:24:01 +08:00
|
|
|
if (Err)
|
2019-08-21 19:07:31 +08:00
|
|
|
reportError(std::move(Err), A->getFileName());
|
2011-10-08 08:18:30 +08:00
|
|
|
}
|
|
|
|
|
2018-05-02 00:10:38 +08:00
|
|
|
/// Open file and figure out how to dump it.
|
2019-01-15 17:19:18 +08:00
|
|
|
static void dumpInput(StringRef file) {
|
2015-01-08 05:02:18 +08:00
|
|
|
// If we are using the Mach-O specific object file parser, then let it parse
|
|
|
|
// the file and process the command line options. So the -arch flags can
|
|
|
|
// be used to select specific slices, etc.
|
|
|
|
if (MachOOpt) {
|
2019-01-15 17:19:18 +08:00
|
|
|
parseInputMachO(file);
|
2011-10-08 08:18:30 +08:00
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
// Attempt to open the binary.
|
2019-04-07 16:19:55 +08:00
|
|
|
OwningBinary<Binary> OBinary = unwrapOrError(createBinary(file), file);
|
|
|
|
Binary &Binary = *OBinary.getBinary();
|
2011-10-08 08:18:30 +08:00
|
|
|
|
2019-01-15 17:19:18 +08:00
|
|
|
if (Archive *A = dyn_cast<Archive>(&Binary))
|
|
|
|
dumpArchive(A);
|
|
|
|
else if (ObjectFile *O = dyn_cast<ObjectFile>(&Binary))
|
|
|
|
dumpObject(O);
|
2018-08-03 08:06:38 +08:00
|
|
|
else if (MachOUniversalBinary *UB = dyn_cast<MachOUniversalBinary>(&Binary))
|
2019-01-15 17:19:18 +08:00
|
|
|
parseInputMachO(UB);
|
2012-08-08 01:53:14 +08:00
|
|
|
else
|
2019-08-21 19:07:31 +08:00
|
|
|
reportError(errorCodeToError(object_error::invalid_file_type), file);
|
2011-10-08 08:18:30 +08:00
|
|
|
}
|
|
|
|
|
2011-01-20 14:39:06 +08:00
|
|
|
int main(int argc, char **argv) {
|
2019-04-15 23:31:42 +08:00
|
|
|
using namespace llvm;
|
2018-04-14 02:26:06 +08:00
|
|
|
InitLLVM X(argc, argv);
|
2019-05-22 14:30:46 +08:00
|
|
|
const cl::OptionCategory *OptionFilters[] = {&ObjdumpCat, &MachOCat};
|
|
|
|
cl::HideUnrelatedOptions(OptionFilters);
|
2011-01-20 14:39:06 +08:00
|
|
|
|
|
|
|
// Initialize targets and assembly printers/parsers.
|
2019-04-15 23:31:42 +08:00
|
|
|
InitializeAllTargetInfos();
|
|
|
|
InitializeAllTargetMCs();
|
|
|
|
InitializeAllDisassemblers();
|
2011-01-20 14:39:06 +08:00
|
|
|
|
2012-05-04 07:20:10 +08:00
|
|
|
// Register the target printer for --version.
|
|
|
|
cl::AddExtraVersionPrinter(TargetRegistry::printRegisteredTargetsForVersion);
|
|
|
|
|
2020-03-16 07:47:49 +08:00
|
|
|
cl::ParseCommandLineOptions(argc, argv, "llvm object file dumper\n", nullptr,
|
|
|
|
/*EnvVar=*/nullptr,
|
|
|
|
/*LongOptionsUseDoubleDash=*/true);
|
2011-01-20 14:39:06 +08:00
|
|
|
|
2019-06-22 08:22:57 +08:00
|
|
|
if (StartAddress >= StopAddress)
|
2019-08-21 19:07:31 +08:00
|
|
|
reportCmdLineError("start address should be less than stop address");
|
2019-06-22 08:22:57 +08:00
|
|
|
|
2011-01-20 14:39:06 +08:00
|
|
|
ToolName = argv[0];
|
|
|
|
|
|
|
|
// Default to a.out if no filenames are specified.
|
2018-12-20 08:57:06 +08:00
|
|
|
if (InputFilenames.empty())
|
2011-01-20 14:39:06 +08:00
|
|
|
InputFilenames.push_back("a.out");
|
|
|
|
|
2020-10-16 22:35:19 +08:00
|
|
|
// Removes trailing separators from prefix.
|
|
|
|
while (!Prefix.empty() && sys::path::is_separator(Prefix.back()))
|
|
|
|
Prefix.pop_back();
|
|
|
|
|
2018-06-28 04:45:11 +08:00
|
|
|
if (AllHeaders)
|
2019-01-18 20:01:59 +08:00
|
|
|
ArchiveHeaders = FileHeaders = PrivateHeaders = Relocations =
|
|
|
|
SectionHeaders = SymbolTable = true;
|
2018-06-28 04:45:11 +08:00
|
|
|
|
2019-05-21 19:05:46 +08:00
|
|
|
if (DisassembleAll || PrintSource || PrintLines ||
|
2020-03-08 04:55:44 +08:00
|
|
|
!DisassembleSymbols.empty())
|
2015-07-24 04:58:49 +08:00
|
|
|
Disassemble = true;
|
2018-07-19 00:39:21 +08:00
|
|
|
|
2019-04-16 10:37:29 +08:00
|
|
|
if (!ArchiveHeaders && !Disassemble && DwarfDumpType == DIDT_Null &&
|
|
|
|
!DynamicRelocations && !FileHeaders && !PrivateHeaders && !RawClangAST &&
|
|
|
|
!Relocations && !SectionHeaders && !SectionContents && !SymbolTable &&
|
2020-04-05 09:58:53 +08:00
|
|
|
!DynamicSymbolTable && !UnwindInfo && !FaultMapSection &&
|
2019-04-16 10:37:29 +08:00
|
|
|
!(MachOOpt &&
|
|
|
|
(Bind || DataInCode || DylibId || DylibsUsed || ExportsTrie ||
|
2021-03-09 10:44:13 +08:00
|
|
|
FirstPrivateHeader || FunctionStarts || IndirectSymbols || InfoPlist ||
|
|
|
|
LazyBind || LinkOptHints || ObjcMetaData || Rebase ||
|
|
|
|
UniversalHeaders || WeakBind || !FilterSections.empty()))) {
|
2011-01-20 14:39:06 +08:00
|
|
|
cl::PrintHelpMessage();
|
2017-12-19 03:46:56 +08:00
|
|
|
return 2;
|
|
|
|
}
|
|
|
|
|
2020-03-08 04:55:44 +08:00
|
|
|
DisasmSymbolSet.insert(DisassembleSymbols.begin(), DisassembleSymbols.end());
|
2018-03-10 03:13:44 +08:00
|
|
|
|
2019-01-15 17:19:18 +08:00
|
|
|
llvm::for_each(InputFilenames, dumpInput);
|
2017-12-19 03:46:56 +08:00
|
|
|
|
2019-07-03 02:38:17 +08:00
|
|
|
warnOnNoMatchForSections();
|
|
|
|
|
2017-12-19 03:46:56 +08:00
|
|
|
return EXIT_SUCCESS;
|
|
|
|
}
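The loop in main that removes trailing separators from Prefix can be sketched in isolation as below. This is an illustration under stated assumptions: `isSeparator` is a hypothetical stand-in for sys::path::is_separator that only recognises '/', and `stripTrailingSeparators` is a hypothetical helper name; neither exists in llvm-objdump.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for sys::path::is_separator. Assumption: only '/'
// is treated as a separator here.
static bool isSeparator(char C) { return C == '/'; }

// Drop every trailing separator, mirroring the while loop in main.
static std::string stripTrailingSeparators(std::string Prefix) {
  while (!Prefix.empty() && isSeparator(Prefix.back()))
    Prefix.pop_back();
  return Prefix;
}
```

The empty() check comes first so that back() is never called on an empty string; a prefix made up only of separators is reduced to the empty string.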
|