llvm-objdump - LLVM's object file dumper
========================================
.. program:: llvm-objdump
SYNOPSIS
--------
:program:`llvm-objdump` [*commands*] [*options*] [*filenames...*]
DESCRIPTION
-----------
The :program:`llvm-objdump` utility prints the contents of object files and
final linked images named on the command line. If no file name is specified,
:program:`llvm-objdump` will attempt to read from *a.out*. If *-* is used as a
file name, :program:`llvm-objdump` will process a file on its standard input
stream.
COMMANDS
--------
At least one of the following commands is required, and some commands can be
combined with other commands:
.. option:: -a, --archive-headers
Display the information contained within an archive's headers.
.. option:: -d, --disassemble
Disassemble all text sections found in the input files.
.. option:: -D, --disassemble-all
Disassemble all sections found in the input files.
.. option:: --disassemble-symbols=<symbol1[,symbol2,...]>
Disassemble only the specified symbols. Takes demangled symbol names when
:option:`--demangle` is specified, otherwise takes mangled symbol names.
Implies :option:`--disassemble`.
.. option:: --dwarf=<value>
Dump the specified DWARF debug sections. The supported values are:
`frames` - .debug_frame
.. option:: -f, --file-headers
Display the contents of the overall file header.
.. option:: --fault-map-section
Display the content of the fault map section.
.. option:: -h, --headers, --section-headers
Display summaries of the headers for each section.
.. option:: --help
Display usage information and exit. Does not stack with other commands.
.. option:: -p, --private-headers
Display format-specific file headers.
.. option:: -r, --reloc
Display the relocation entries in the file.
.. option:: -R, --dynamic-reloc
Display the dynamic relocation entries in the file.
.. option:: --raw-clang-ast
Dump the raw binary contents of the clang AST section.
.. option:: -s, --full-contents
Display the contents of each section.
.. option:: -t, --syms
Display the symbol table.
.. option:: -T, --dynamic-syms
Display the contents of the dynamic symbol table.
.. option:: -u, --unwind-info
Display the unwind info of the input(s).
.. option:: -v, --version
Display the version of the :program:`llvm-objdump` executable. Does not stack
with other commands.
.. option:: -x, --all-headers
Display all available header information. Equivalent to specifying
:option:`--archive-headers`, :option:`--file-headers`,
:option:`--private-headers`, :option:`--reloc`, :option:`--section-headers`,
and :option:`--syms`.
OPTIONS
-------
:program:`llvm-objdump` supports the following options:
.. option:: --adjust-vma=<offset>
Increase the displayed address in disassembly or section header printing by
the specified offset.
.. option:: --arch-name=<string>
Specify the target architecture when disassembling. Use :option:`--version`
for a list of available targets.
.. option:: -C, --demangle
Demangle symbol names in the output.
.. option:: --debug-vars=<format>
Print the locations (in registers or memory) of source-level variables
alongside disassembly. ``format`` may be ``unicode`` or ``ascii``, defaulting
to ``unicode`` if omitted.
.. option:: --debug-vars-indent=<width>
Distance to indent the source-level variable display, relative to the start
of the disassembly. Defaults to 52 characters.
.. option:: -j, --section=<section1[,section2,...]>
Perform commands on the specified sections only. For Mach-O use
`segment,section` to specify the section name.
.. option:: -l, --line-numbers
When disassembling, display source line numbers. Implies
:option:`--disassemble`.
.. option:: -M, --disassembler-options=<opt1[,opt2,...]>
Pass target-specific disassembler options. Available options:
* ``reg-names-std``: ARM only (default). Print register names as used in ARM's instruction set documentation, with r13/r14/r15 replaced by sp/lr/pc.
* ``reg-names-raw``: ARM only. Use r followed by the register number.
* ``no-aliases``: AArch64 and RISC-V only. Print the raw instruction mnemonic instead of the pseudo-instruction mnemonic.
* ``numeric``: RISC-V only. Print raw register names instead of ABI mnemonics (e.g. print x1 instead of ra).
* ``att``: x86 only (default). Print in the AT&T syntax.
* ``intel``: x86 only. Print in the Intel syntax.
.. option:: --mcpu=<cpu-name>
Target a specific CPU type for disassembly. Specify ``--mcpu=help`` to display
available CPUs.
.. option:: --mattr=<a1,+a2,-a3,...>
Enable/disable target-specific attributes. Specify ``--mattr=help`` to display
the available attributes.
.. option:: --no-leading-addr
When disassembling, do not print leading addresses.
.. option:: --no-print-imm-hex
Do not use hex format for immediate values in disassembly output (default).
.. option:: --no-show-raw-insn
When disassembling, do not print the raw bytes of each instruction.
.. option:: --prefix=<prefix>
When disassembling with the :option:`--source` option, prepend ``prefix`` to
absolute paths.
.. option:: --prefix-strip=<level>
When disassembling with the :option:`--source` option, strip out ``level``
initial directories from absolute paths. This option has no effect without
:option:`--prefix`.
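The interaction between :option:`--prefix` and :option:`--prefix-strip` can be pictured with a small Python model. This is only a sketch of the behavior described above, not the tool's implementation; the function name ``rewrite_source_path`` and the example paths are hypothetical, and POSIX-style paths are assumed.

```python
from pathlib import PurePosixPath


def rewrite_source_path(path, prefix="", strip_level=0):
    """Illustrative model of --prefix / --prefix-strip for one source path."""
    p = PurePosixPath(path)
    # Only absolute paths are rewritten, and stripping has no effect
    # unless a prefix is also given.
    if not prefix or not p.is_absolute():
        return path
    # parts[0] is the root "/"; drop it, then drop `strip_level`
    # leading directories before prepending the prefix.
    components = p.parts[1:][strip_level:]
    return str(PurePosixPath(prefix).joinpath(*components))


print(rewrite_source_path("/build/src/lib/foo.c", prefix="/home/me", strip_level=2))
# prints "/home/me/lib/foo.c"
```

In this model, a relative path such as ``lib/foo.c`` is returned unchanged, as is any path when no prefix is supplied.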
.. option:: --print-imm-hex
Use hex format when printing immediate values in disassembly output.
.. option:: -S, --source
When disassembling, display source interleaved with the disassembly. Implies
:option:`--disassemble`.
.. option:: --show-lma
Display the LMA column when dumping ELF section headers. Defaults to off
unless any section has different VMA and LMAs.
.. option:: --start-address=<address>
When disassembling, only disassemble from the specified address.
When printing relocations, only print the relocations patching offsets from at least ``address``.
When printing symbols, only print symbols with a value of at least ``address``.
.. option:: --stop-address=<address>
When disassembling, only disassemble up to, but not including, the specified address.
When printing relocations, only print the relocations patching offsets up to ``address``.
When printing symbols, only print symbols with a value up to ``address``.
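For disassembly, :option:`--start-address` and :option:`--stop-address` together select a half-open address range. The Python sketch below models only that disassembly case; the helper name and the instruction addresses are made up for illustration, and it does not attempt to model how relocations or symbols at exactly the stop address are treated.

```python
def in_disassembly_range(address, start=0, stop=None):
    """Half-open range check: `start` is included, `stop` is excluded."""
    return address >= start and (stop is None or address < stop)


# Hypothetical (address, instruction) pairs.
instructions = [
    (0x1000, "push rax"),
    (0x1001, "mov eax, 1"),
    (0x1006, "pop rcx"),
    (0x1007, "ret"),
]

selected = [text for addr, text in instructions
            if in_disassembly_range(addr, start=0x1001, stop=0x1007)]
print(selected)  # prints ['mov eax, 1', 'pop rcx']
```

The instruction at ``0x1007`` is left out because the stop address itself is not part of the range.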
.. option:: --symbolize-operands
When disassembling, symbolize a branch target operand to print a label instead of a real address.
When printing a PC-relative global symbol reference, print it as an offset from the leading symbol.
Only works with an X86 linked image.
Example:
A non-symbolized branch instruction with a local target and pc-relative memory access like
.. code-block:: none
cmp eax, dword ptr [rip + 4112]
jge 0x20117e <_start+0x25>
might become
.. code-block:: none
<L0>:
cmp eax, dword ptr <g>
jge <L0>
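The branch-label half of this transformation can be imitated with a toy two-pass rewrite in Python. This is a rough sketch, not how :program:`llvm-objdump` is implemented: it works on already-printed text, the regular expression and the function name ``symbolize`` are hypothetical, the instruction addresses are invented, and numbering labels in order of first reference is an assumption. PC-relative memory operands are not handled here.

```python
import re

# Matches a printed branch operand such as "0x20117e <_start+0x25>".
TARGET = re.compile(r"0x([0-9a-f]+) <[^>]+\+0x[0-9a-f]+>")


def symbolize(lines):
    """lines: list of (address, text). Returns text with local labels."""
    labels = {}
    # Pass 1: collect branch targets and assign each one a label.
    for _, text in lines:
        m = TARGET.search(text)
        if m:
            labels.setdefault(int(m.group(1), 16), f"L{len(labels)}")
    # Pass 2: emit label definitions and rewrite the operands.
    out = []
    for addr, text in lines:
        if addr in labels:
            out.append(f"<{labels[addr]}>:")
        out.append(TARGET.sub(lambda m: f"<{labels[int(m.group(1), 16)]}>", text))
    return out


listing = [
    (0x201169, "mov eax, dword ptr [rsp]"),
    (0x20116C, "jge 0x20117e <_start+0x25>"),
    (0x201172, "inc dword ptr [rsp]"),
    (0x201175, "jmp 0x201169 <_start+0x10>"),
    (0x20117E, "xor eax, eax"),
]
print("\n".join(symbolize(listing)))
```

Because the rewritten text no longer contains raw addresses, inserting or deleting an instruction earlier in the function would not change these lines, which is what makes the output easier to diff.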
.. option:: --triple=<string>
Target triple to disassemble for. See ``--version`` for available targets.
.. option:: -w, --wide
Ignored for compatibility with GNU objdump.
.. option:: --x86-asm-syntax=<style>
Deprecated.
When used with :option:`--disassemble`, choose style of code to emit from
X86 backend. Supported values are:
.. option:: att
AT&T-style assembly
.. option:: intel
Intel-style assembly
The default disassembly style is **att**.
.. option:: -z, --disassemble-zeroes
Do not skip blocks of zeroes when disassembling.
.. option:: @<FILE>
Read command-line options and commands from response file `<FILE>`.
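The idea behind a response file can be sketched in Python as follows. This is an approximation for illustration only: the helper ``expand_response_files`` is hypothetical, ``shlex.split`` stands in for the tool's own tokenizer (whose quoting rules may differ), and nested response files are not expanded.

```python
import os
import shlex
import tempfile


def expand_response_files(argv):
    """Replace each @FILE argument with the whitespace-separated tokens in FILE."""
    expanded = []
    for arg in argv:
        if arg.startswith("@") and os.path.isfile(arg[1:]):
            with open(arg[1:]) as f:
                expanded.extend(shlex.split(f.read()))
        else:
            expanded.append(arg)
    return expanded


# Write a small response file, then expand a command line that refers to it.
with tempfile.NamedTemporaryFile("w", suffix=".rsp", delete=False) as rsp:
    rsp.write("-d --no-show-raw-insn\n--demangle\n")

args = expand_response_files(["llvm-objdump", "@" + rsp.name, "a.out"])
os.unlink(rsp.name)
print(args)
# prints ['llvm-objdump', '-d', '--no-show-raw-insn', '--demangle', 'a.out']
```

Options read from the file take the place of the ``@<FILE>`` argument, so they can be mixed with options and file names given directly on the command line.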
MACH-O ONLY OPTIONS AND COMMANDS
--------------------------------
.. option:: --arch=<architecture>
Specify the architecture to disassemble. See ``--version`` for available
architectures.
.. option:: --archive-member-offsets
Print the offset to each archive member for Mach-O archives (requires
:option:`--archive-headers`).
.. option:: --bind
Display binding info.
.. option:: --data-in-code
Display the data in code table.
.. option:: --dis-symname=<name>
Disassemble just the specified symbol's instructions.
.. option:: --dylibs-used
Display the shared libraries used for linked files.
.. option:: --dsym=<string>
Use .dSYM file for debug info.
.. option:: --dylib-id
Display the shared library's ID for dylib files.
.. option:: --exports-trie
Display exported symbols.
.. option:: --function-starts
Print the function starts table for Mach-O objects.
.. option:: -g
Print line information from debug info if available.
.. option:: --full-leading-addr
Print the full leading address when disassembling.
.. option:: --indirect-symbols
Display the indirect symbol table.
.. option:: --info-plist
Display the info plist section as strings.
.. option:: --lazy-bind
Display lazy binding info.
.. option:: --link-opt-hints
Display the linker optimization hints.
.. option:: -m, --macho
Use Mach-O specific object file parser. Commands and other options may behave
differently when used with ``--macho``.
.. option:: --no-leading-headers
Do not print any leading headers.
.. option:: --no-symbolic-operands
Do not print symbolic operands when disassembling.
.. option:: --non-verbose
Display the information for Mach-O objects in non-verbose or numeric form.
.. option:: --objc-meta-data
Display the Objective-C runtime meta data.
.. option:: --private-header
Display only the first format-specific file header.
.. option:: --rebase
Display rebasing information.
.. option:: --rpaths
Display runtime search paths for the binary.
.. option:: --universal-headers
Display universal headers.
.. option:: --weak-bind
Display weak binding information.
XCOFF ONLY OPTIONS AND COMMANDS
---------------------------------
.. option:: --symbol-description
Add symbol description to disassembly output.
BUGS
----
To report bugs, please visit <https://bugs.llvm.org/>.
SEE ALSO
--------
:manpage:`llvm-nm(1)`, :manpage:`llvm-otool(1)`, :manpage:`llvm-readelf(1)`,
:manpage:`llvm-readobj(1)`