options other than just -disassemble so that universal files can be used with other
options combined with -arch options.
No functional change to existing options and use. One test case added for the
additional functionality with a universal file an a -arch option.
llvm-svn: 225383
Also corrected the name of the load command to not end in an ’S’ as well as corrected
the name of the MachO::linker_option_command struct and other places that had the
word option as plural which did not match the Mac OS X headers.
llvm-svn: 224485
with fixes. Includes the move of tests for llvm-objdump for universal files to an X86
directory. And the fix where it was failing on linux Rafael tracked down with asan.
I had both Jim Grosbach and Adam Hemet look over the second fix since I could not
set up asan to reproduce with the old version but not with the fix.
llvm-svn: 223416
Summary: Add rpath load command support in Mach-O object and update llvm-objdump to use it.
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D6512
llvm-svn: 223343
llvm-objdump printed out an error message for this off-by-one error,
but because it always exits with 0 whether or not it found an error,
the test (llvm-objdump/coff-many-relocs.test) succeeded.
I made llvm-objdump exit with EXIT_FAILURE when an error is found.
llvm-svn: 222852
We were a little lax in a few areas:
- We pretended that import libraries were like any old COFF file, they
are not. In fact, they aren't really COFF files at all, we should
probably grow some specialized functionality to handle them smarter.
- Our symbol iterators were more than happy to attempt to go past the
end of the symbol table if you had a symbol with a bad list of
auxiliary symbols.
llvm-svn: 222124
FYI, removed the unused MCInstrAnalysis as it does not exist for 64-bit ARM and
was causing a “couldn't initialize disassembler for target” error.
llvm-svn: 222045
With this patch MCDisassembler::getInstruction takes an ArrayRef<uint8_t>
instead of a MemoryObject.
Even on X86 there is a maximum size an instruction can have. Given
that, it seems way simpler and more efficient to just pass an ArrayRef
to the disassembler instead of a MemoryObject and have it do a virtual
call every time it wants some extra bytes.
llvm-svn: 221751
add the code and test cases for 32-bit ARM symbolizer.
Also fixed the printing of data in code as it was not using the table correctly
and needed to fix one of the test cases too.
This will break lld’s test/mach-o/arm-interworking-movw.yaml till the tweak
for that is made. Which I’ll be committing immediately after this commit.
llvm-svn: 221470
There are two methods in SectionRef that can fail:
* getName: The index into the string table can be invalid.
* getContents: The section might point to invalid contents.
Every other method will always succeed and returning and std::error_code just
complicates the code. For example, a section can have an invalid alignment,
but if we are able to get to the section structure at all and create a
SectionRef, we will always be able to read that invalid alignment.
llvm-svn: 219314
So in fully linked images when a call is made through a stub it now gets a
comment like the following in the disassembly:
callq 0x100000f6c ## symbol stub for: _printf
indicating the call is to a symbol stub and which symbol it is for. This is
done for branch reference types and seeing if the branch target is in a stub
section and if so using the indirect symbol table entry for that stub and
using that symbol table entries symbol name.
llvm-svn: 218546
get the literal string “Hello world” printed as a comment on the instruction
that loads the pointer to it. For now this is just for x86_64. So for object
files with relocation entries it produces things like:
leaq L_.str(%rip), %rax ## literal pool for: "Hello world\n"
and similar for fully linked images like executables:
leaq 0x4f(%rip), %rax ## literal pool for: "Hello world\n"
Also to allow testing against darwin’s otool(1), I hooked up the existing
-no-show-raw-insn option to the Mach-O parser code, added the new Mach-O
only -full-leading-addr option to match otool(1)'s printing of addresses and
also added the new -print-imm-hex option.
llvm-svn: 218423
First step done in this commit is to get flush out enough of the
SymbolizerGetOpInfo() routine to symbolic an X86_64 hello world .o and
its loading of the literal string and call to printf. Also the code to
symbolicate the X86_64_RELOC_SUBTRACTOR relocation and a test is also
added to show a slightly more complicated case.
Next will be to flush out enough of SymbolizerSymbolLookUp() to get the
literal string “Hello world” printed as a comment on the instruction that load
the pointer to it.
llvm-svn: 217893
This finishes the ability of llvm-objdump to print out all information from
the LC_DYLD_INFO load command.
The -bind option prints out symbolic references that dyld must resolve
immediately.
The -lazy-bind option prints out symbolc reference that are lazily resolved on
first use.
The -weak-bind option prints out information about symbols which dyld must
try to coalesce across images.
llvm-svn: 217853
Teach WinCOFFObjectWriter how to write -mbig-obj style object files;
these object files allow for more sections inside an object file.
Our support for BigObj is notably different from binutils and cl: we
implicitly upgrade object files to BigObj instead of asking the user to
compile the same file *again* but with another flag. This matches up
with how LLVM treats ELF variants.
This was tested by forcing LLVM to always emit BigObj files and running
the entire test suite. A specific test has also been added.
I've lowered the maximum number of sections in a normal COFF file,
VS "14" CTP 3 supports no more than 65279 sections. This is important
otherwise we might not switch to BigObj quickly enough, leaving us with
a COFF file that we couldn't link.
yaml2obj support is all that remains to implement.
Differential Revision: http://reviews.llvm.org/D5349
llvm-svn: 217812
Similar to my previous -exports-trie option, the -rebase option dumps info from
the LC_DYLD_INFO load command. The rebasing info is a list of the the locations
that dyld needs to adjust if a mach-o image is not loaded at its preferred
address. Since ASLR is now the default, images almost never load at their
preferred address, and thus need to be rebased by dyld.
llvm-svn: 217709
This adds support for reading the "bigobj" variant of COFF produced by
cl's /bigobj and mingw's -mbig-obj.
The most significant difference that bigobj brings is more than 2**16
sections to COFF.
bigobj brings a few interesting differences with it:
- It doesn't have a Characteristics field in the file header.
- It doesn't have a SizeOfOptionalHeader field in the file header (it's
only used in executable files).
- Auxiliary symbol records have the same width as a symbol table entry.
Since symbol table entries are bigger, so are auxiliary symbol
records.
Write support will come soon.
Differential Revision: http://reviews.llvm.org/D5259
llvm-svn: 217496
The code is buggy and barely tested. It is also mostly boilerplate.
(This includes MCObjectDisassembler, which is the interface to that
functionality)
Following an IRC discussion with Jim Grosbach, it seems sensible to just
nuke the whole lot of functionality, and dig it up from VCS if
necessary (I hope not!).
All of this stuff appears to have been added in a huge patch dump (look
at the timeframe surrounding e.g. r182628) where almost every patch
seemed to be untested and not reviewed before being committed.
Post-review responses to the patches were never addressed. I don't think
any of it would have passed pre-commit review.
I doubt anyone is depending on this, since this code appears to be
extremely buggy. In limited testing that Michael Spencer and I did, we
couldn't find a single real-world object file that wouldn't crash the
CFG reconstruction stuff. The symbolizer stuff has O(n^2) behavior and
so is not much use to anyone anyway. It seemed simpler to remove them as
a whole. Most of this code is boilerplate, which is the only way it was
able to scrape by 60% coverage.
HEADSUP: Modules folks, some files I nuked were referenced from
include/llvm/module.modulemap; I just deleted the references. Hopefully
that is the right fix (one was a FIXME though!).
llvm-svn: 216983
MachOObjectFile in lib/Object currently has no support for parsing the rebase,
binding, and export information from the LC_DYLD_INFO load command in final
linked mach-o images. This patch adds support for parsing the exports trie data
structure. It also adds an option to llvm-objdump to dump that export info.
I did the exports parsing first because it is the hardest. The information is
encoded in a trie structure, but the standard ObjectFile way to inspect content
is through iterators. So I needed to make an iterator that would do a
non-recursive walk through the trie and maintain the concatenation of edges
needed for the current string prefix.
I plan to add similar support in MachOObjectFile and llvm-objdump to
parse/display the rebasing and binding info too.
llvm-svn: 216808
Take a StringRef instead of a "const char *".
Take a "std::error_code &" instead of a "std::string &" for error.
A create static method would be even better, but this patch is already a bit too
big.
llvm-svn: 216393
The switch statement would never fire due to the preceding break statement. Also, the switch statement has a default label with no case labels. Simplified the code, and allow it to execute.
llvm-svn: 216346
Owning the buffer is somewhat inflexible. Some Binaries have sub Binaries
(like Archive) and we had to create dummy buffers just to handle that. It is
also a bad fit for IRObjectFile where the Module wants to own the buffer too.
Keeping this ownership would make supporting IR inside native objects
particularly painful.
This patch focuses in lib/Object. If something elsewhere used to own an Binary,
now it also owns a MemoryBuffer.
This patch introduces a few new types.
* MemoryBufferRef. This is just a pair of StringRefs for the data and name.
This is to MemoryBuffer as StringRef is to std::string.
* OwningBinary. A combination of Binary and a MemoryBuffer. This is needed
for convenience functions that take a filename and return both the
buffer and the Binary using that buffer.
The C api now uses OwningBinary to avoid any change in semantics. I will start
a new thread to see if we want to change it and how.
llvm-svn: 216002
file with -macho, the Mach-O specific object file parser option.
After some discussion I chose to do this implementation contained in the logic
of llvm-objdump’s MachODump.cpp using a second disassembler for thumb when
needed and with updates mostly contained in the MachOObjectFile class.
llvm-svn: 215931
Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)
Changes made by clang-tidy with minor tweaks.
llvm-svn: 215558
ARM bots (& others, I think, now that I look) were failing because we
were using incorrect printf-style format specifiers. They were wrong
on almost any platform, actually, just mostly harmlessly so.
llvm-svn: 215196
Also make the disassembler created with the Mach-O parser (the -m option)
pick up the Target specific attributes specified with -mattr option.
llvm-svn: 215032
The size of the uninitialized sections, like BSS, can exceed the size of
the object file.
Do not attempt to grab the contents of such sections.
llvm-svn: 212953
The new library is 150KB on a Release+Asserts build, so it is quiet a bit of
code that regular users of MC don't need to link with now.
llvm-svn: 212209
string_ostream is a safe and efficient string builder that combines opaque
stack storage with a built-in ostream interface.
small_string_ostream<bytes> additionally permits an explicit stack storage size
other than the default 128 bytes to be provided. Beyond that, storage is
transferred to the heap.
This convenient class can be used in most places an
std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair
would previously have been used, in order to guarantee consistent access
without byte truncation.
The patch also converts much of LLVM to use the new facility. These changes
include several probable bug fixes for truncated output, a programming error
that's no longer possible with the new interface.
llvm-svn: 211749
This makes the buffer ownership on error conditions very natural. The buffer
is only moved out of the argument if an object is constructed that now
owns the buffer.
llvm-svn: 211546
The idea of this patch is to turn llvm/Support/system_error.h into a
transitional header that just brings in the erorr_code api to the llvm
namespace. I will remove it shortly afterwards.
The cases where the general idea needed some tweaking:
* std::errc is a namespace in msvc, so we cannot use "using std::errc". I could
add an #ifdef, but there were not that many uses, so I just added std:: to
them in this patch.
* Template specialization had to be moved to the std namespace in this
patch set already.
* The msvc implementation of default_error_condition doesn't seem to
provide the same transformations as we need. Not too surprising since
the standard doesn't actually say what "equivalent" means. I fixed the
problem by keeping our old mapping and using it at error_code
construction time.
Despite these shortcomings I think this is still a good thing. Some reasons:
* The different implementations of system_error might improve over time.
* It removes 925 lines of code from llvm already.
* It removes 6313 bytes from the text segment of the clang binary when
it is built with gcc and 2816 bytes when building with clang and
libstdc++.
llvm-svn: 210687
Immutable DILineInfo doesn't bring any benefits and complicates
code. Also, use std::string instead of SmallString<16> for file
and function names - their length can vary significantly.
No functionality change.
llvm-svn: 206654
Since LLVM currently only supports WinCOFF, assume that the input is WinCOFF
rather than another type of COFF file (ECOFF/XCOFF). If the architecture is
detected as thumb (e.g. the file has a IMAGE_FILE_MACHINE_ARMNT magic) then use
a triple of thumbv7-windows.
This allows for objdump to properly handle WoA object files without having to
specify the target triple manually.
llvm-svn: 206446
This patch re-introduces the MCContext member that was removed from
MCDisassembler in r206063, and requires that an MCContext be passed in at
MCDisassembler construction time. (Previously the MCContext member had been
initialized in an ad-hoc fashion after construction). The MCCContext member
can be used by MCDisassembler sub-classes to construct constant or
target-specific MCExprs.
This patch updates disassemblers for in-tree targets, and provides the
MCRegisterInfo instance that some disassemblers were using through the
MCContext (previously those backends were constructing their own
MCRegisterInfo instances).
llvm-svn: 206241
Once the auxiliary fields relating to the filename have been inspected, any
following auxiliary fields need not be visited as they have been consumed (the
following fields comprise the filepath as a single unit).
Adjust the test to catch this even if ASAN is not enabled.
llvm-svn: 206190
Rather than switching behaviour on whether a previous symbol has an auxiliary
symbol record for the next count of elements, simply iterate over the auxiliary
symbols right after processing the current symbol entry. This makes the
behaviour much simpler to follow and similar to llvm-readobj and yaml2obj.
llvm-svn: 206146
If a filename is a multiple of 18 characters, there will be no null-terminator.
This will result in an invalid access by the constructed StringRef. Add a test
case to exercise this and fix that handling. Address this same vulnerability in
llvm-readobj as well.
llvm-svn: 206145
The auxiliary file records are contiguous and only contain the filename.
Construct a StringRef directly rather than copying to a temporary buffer.
Suggested by majnemer on IRC!
llvm-svn: 206139
Add support for file auxiliary symbol entries in COFF symbol tables. A COFF
symbol table with a FILE entry is followed by sizeof(__FILE__) / 18 auxiliary
symbol records which contain the filename. Read them and form the original
filename that the record contains. Then display the name in the output.
llvm-svn: 206126
The current state of affairs has auxiliary symbols described as a big
bag of bytes. This is less than satisfying, it detracts from the YAML
file as being human readable.
Instead, allow for symbols to optionally contain their auxiliary data.
This allows us to have a much higher level way of describing things like
weak symbols, function definitions and section definitions.
This depends on D3105.
Differential Revision: http://llvm-reviews.chandlerc.com/D3092
llvm-svn: 204214
This is a preliminary setup change to support a renaming of Windows target
triples. Split the object file format information out of the environment into a
separate entity. Unfortunately, file format was previously treated as an
environment with an unknown OS. This is most obvious in the ARM subtarget where
the handling for macho on an arbitrary platform switches to AAPCS rather than
APCS (as per Apple's needs).
llvm-svn: 203160
This compiles with no changes to clang/lld/lldb with MSVC and includes
overloads to various functions which are used by those projects and llvm
which have OwningPtr's as parameters. This should allow out of tree
projects some time to move. There are also no changes to libs/Target,
which should help out of tree targets have time to move, if necessary.
llvm-svn: 203083
Unwind info contents were indented at the same level as function table
contents. That's a bit confusing because the unwind info is pointed by
function table. In other places we usually increment indentation depth
by one when dereferncing a pointer.
This patch also removes extraneous newlines between function tables.
llvm-svn: 202879
The original code does not work correctly on executable files because the
code is written in such a way that only object files are assumed to be given
to llvm-objdump.
Contents of RuntimeFunction are different between executables and objects. In
executables, fields in RuntimeFunction have actual addresses to unwind info
structures. On the other hand, in object files, the fields have zero value,
but instead there are relocations pointing to the fields, so that Linker will
fill them at link-time.
So, when we are reading an object file, we need to use relocation info to
find the location of unwind info. When executable, we should just look at the
values in RuntimeFunction.
llvm-svn: 202785
The current COFF unwind printer tries to print SEH handler function names,
assuming that it can always find function names in string table. It crashes
if file being read has no symbol table (i.e. executable).
With this patch, llvm-objdump prints SEH handler's RVA if there's no symbol
table entry for that RVA.
llvm-svn: 202466
boundaries.
It is possible to create an ELF executable where symbol from say .text
section 'points' to the address outside the section boundaries. It does
not have a sense to disassemble something outside the section.
Without this fix llvm-objdump prints finite or infinite (depends on
the executable file architecture) number of 'invalid instruction
encoding' warnings.
llvm-svn: 202083
After this I will set the default back to F_None. The advantage is that
before this patch forgetting to set F_Binary would corrupt a file on windows.
Forgetting to set F_Text produces one that cannot be read in notepad, which
is a better failure mode :-)
llvm-svn: 202052
Load Configuration Table may contain a pointer to SEH table. This patch is to
print the offset to the table. Printing SEH table contents is a TODO.
The layout of Layout Configuration Table is described in Microsoft PE/COFF
Object File Format Spec, but the table's offset/size descriptions seems to be
totally wrong, at least in revision 8.3 of the spec. I believe the table in
this patch is the correct one.
llvm-svn: 201638
None of the object file formats reported error on iterator increment. In
retrospect, that is not too surprising: no object format stores symbols or
sections in a linked list or other structure that requires chasing pointers.
As a consequence, all error checking can be done on begin() and end().
This reduces the text segment of bin/llvm-readobj in my machine from 521233 to
518526 bytes.
llvm-svn: 200442
This fixes a regression introduced by r182908, which broke
llvm-objdump's ability to display relocations inline in a disassembly
dump for ELF object files.
That change removed a SectionRelocMap from Object/ELF.h, which we
recreate in llvm-objdump.cpp.
I discovered this regression via an out-of-tree test
(test/NaCl/X86/pnacl-hides-sandbox-x86-64.ll) which used llvm-objdump.
Note that the "Unknown" string in the test output on i386 isn't quite
right, but this appears to be a pre-existing bug.
Differential Revision: http://llvm-reviews.chandlerc.com/D2559
llvm-svn: 200090
The constructors of classes deriving from Binary normally take an error_code
as an argument to the constructor. My original intent was to change them
to have a trivial constructor and move the initial parsing logic to a static
method returning an ErrorOr. I changed my mind because:
* A constructor with an error_code out parameter is extremely convenient from
the implementation side. We can incrementally construct the object and give
up when we find an error.
* It is very efficient when constructing on the stack or when there is no
error. The only inefficient case is where heap allocating and an error is
found (we have to free the memory).
The result is that this is a much smaller patch. It just standardizes the
create* helpers to return an ErrorOr.
Almost no functionality change: The only difference is that this found that
we were trying to read past the end of COFF import library but ignoring the
error.
llvm-svn: 199770