The BinaryHolder would query the archive member MemoryBuffer name
to check if the current open archive also contains the next requested
objectfile. This comparison was using a StringRef to a temporary
buffer. It only happened with fat archives. This commit adds long-lived
storage along with the MemoryBuffers for the fat archive filename.
The added test would fail during an ASAN build without the fix.
llvm-svn: 268924
r267249 removed the dual ARM/Thumb interface from MachOObjectFile,
simplifying llvm-dsymutil's code. This unfortunately also regressed
llvm-dsymutil's ability to select thumb slices, because the simplified
code was also dealing with the discrepency between the slice arch
(eg. armv7m) and the triple arch name (eg. thumbv7m).
llvm-svn: 268894
llvm-dsymutil used to create the temporary files in the output directory.
This works fine except when the output directory contains a '%' char, which
is then replaced by llvm::sys::fs::createUniqueFile() generating an invalid
path.
Just use the default temp dir for those files.
llvm-svn: 268304
Produce another specific error message for a malformed Mach-O file when a symbol’s
section index is more than the number of sections. The existing test case in test/Object/macho-invalid.test
for macho-invalid-section-index-getSectionRawName now reports the error with the message indicating
that a symbol at a specific index has a bad section index and that bad section index value.
Again converting interfaces to Expected<> from ErrorOr<> does involve
touching a number of places. Where the existing code reported the error with a
string message or an error code it was converted to do the same.
Also there some were bugs in the existing code that did not deal with the
old ErrorOr<> return values. So now with Expected<> since they must be
checked and the error handled, I added a TODO and a comment:
"// TODO: Actually report errors helpfully" and a call something like
consumeError(NameOrErr.takeError()) so the buggy code will not crash
since needed to deal with the Error.
llvm-svn: 268298
Until PR27449 (https://llvm.org/bugs/show_bug.cgi?id=27449) is fixed in
clang this warning is pointless, since ASTFileSignatures will change
randomly when a module is rebuilt.
rdar://problem/25610919
llvm-svn: 267427
Only one consumer (llvm-objdump) actually cared about the fact that there were
two triples. Others were actively working around the fact that the Triple
returned by getArch might have been invalid. As for llvm-objdump, it needs to
be acutely aware of both Triples anyway, so being generic in the exposed API is
no benefit.
Also rename the version of getArch returning a Triple. Users were having to
pass an unwanted nullptr to disambiguate the two, which was nasty.
The only functional change here is that armv7m and armv7em object files no
longer crash llvm-objdump.
llvm-svn: 267249
Produce another specific error message for a malformed Mach-O file when a symbol’s
string index is past the end of the string table. The existing test case in test/Object/macho-invalid.test
for macho-invalid-symbol-name-past-eof now reports the error with the message indicating
that a symbol at a specific index has a bad sting index and that bad string index value.
Again converting interfaces to Expected<> from ErrorOr<> does involve
touching a number of places. Where the existing code reported the error with a
string message or an error code it was converted to do the same. There is some
code for this that could be factored into a routine but I would like to leave that for
the code owners post-commit to do as they want for handling an llvm::Error. An
example of how this could be done is shown in the diff in
lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h which had a Check() routine
already for std::error_code so I added one like it for llvm::Error .
Also there some were bugs in the existing code that did not deal with the
old ErrorOr<> return values. So now with Expected<> since they must be
checked and the error handled, I added a TODO and a comment:
“// TODO: Actually report errors helpfully” and a call something like
consumeError(NameOrErr.takeError()) so the buggy code will not crash
since needed to deal with the Error.
Note there fixes needed to lld that goes along with this that I will commit right after this.
So expect lld not to built after this commit and before the next one.
llvm-svn: 266919
Produce the first specific error message for a malformed Mach-O file describing
the problem instead of the generic message for object_error::parse_failed of
"Invalid data was encountered while parsing the file”. Many more good error
messages will follow after this first one.
This is built on Lang Hames’ great work of adding the ’Error' class for
structured error handling and threading Error through MachOObjectFile
construction. And making createMachOObjectFile return Expected<...> .
So to to get the error to the llvm-obdump tool, I changed the stack of
these methods to also return Expected<...> :
object::ObjectFile::createObjectFile()
object::SymbolicFile::createSymbolicFile()
object::createBinary()
Then finally in ParseInputMachO() in MachODump.cpp the error can
be reported and the specific error message can be printed in llvm-objdump
and can be seen in the existing test case for the existing malformed binary
but with the updated error message.
Converting these interfaces to Expected<> from ErrorOr<> does involve
touching a number of places. To contain the changes for now use of
errorToErrorCode() and errorOrToExpected() are used where the callers
are yet to be converted.
Also there some were bugs in the existing code that did not deal with the
old ErrorOr<> return values. So now with Expected<> since they must be
checked and the error handled, I added a TODO and a comment:
“// TODO: Actually report errors helpfully” and a call something like
consumeError(ObjOrErr.takeError()) so the buggy code will not crash
since needed to deal with the Error.
Note there is one fix also needed to lld/COFF/InputFiles.cpp that goes along
with this that I will commit right after this. So expect lld not to built
after this commit and before the next one.
llvm-svn: 265606
in the test suite. While this is not really an interesting tool and option to run
on a Mach-O file to show the symbol table in a generic libObject format
it shouldn’t crash.
The reason for the crash was in MachOObjectFile::getSymbolType() when it was
calling MachOObjectFile::getSymbolSection() without checking its return value
for the error case.
What makes this fix require a fair bit of diffs is that the method getSymbolType() is
in the class ObjectFile defined without an ErrorOr<> so I needed to add that all
the sub classes. And all of the uses needed to be updated and the return value
needed to be checked for the error case.
The MachOObjectFile version of getSymbolType() “can” get an error in trying to
come up with the libObject’s internal SymbolRef::Type when the Mach-O symbol
symbol type is an N_SECT type because the code is trying to select from the
SymbolRef::ST_Data or SymbolRef::ST_Function values for the SymbolRef::Type.
And it needs the Mach-O section to use isData() and isBSS to determine if
it will return SymbolRef::ST_Data.
One other possible fix I considered is to simply return SymbolRef::ST_Other
when MachOObjectFile::getSymbolSection() returned an error. But since in
the past when I did such changes that “ate an error in the libObject code” I
was asked instead to push the error out of the libObject code I chose not
to implement the fix this way.
As currently written both the COFF and ELF versions of getSymbolType()
can’t get an error. But if isReservedSectionNumber() wanted to check for
the two known negative values rather than allowing all negative values or
the code wanted to add the same check as in getSymbolAddress() to use
getSection() and check for the error then these versions of getSymbolType()
could return errors.
At the end of the day the error printed now is the generic “Invalid data was
encountered while parsing the file” for object_error::parse_failed. In the
future when we thread Lang’s new TypedError for recoverable error handling
though libObject this will improve. And where the added // Diagnostic(…
comment is, it would be changed to produce and error message
like “bad section index (42) for symbol at index 8” for this case.
llvm-svn: 264187
Now that the resolved path cache stores the StringRef's, its
best to just always cache the results, even when realpath isn't
used. This way we'll still avoid the StringMap hashing and lookup.
This also conveniently reorganises this code in a way I need for
a future patch.
llvm-svn: 263777
ResolvedPaths was storing std::string's as a cache. We would then take those strings and look them up in the internString pool to get a unique StringRef for each path.
This patch changes ResolvedPaths to store the StringRef pointing in to the internString pool itself. This way, when getResolvedPath returns a string, we know we have the StringRef we would find in the pool anyway. We can avoid the duplicate memory of the std::string's, and also the time from the lookup.
Unfortunately my profiles show no runtime change here, but it should still save memory allocations which is nice.
Reviewed by Frederic Riss.
Differential Revision: http://reviews.llvm.org/D18259
llvm-svn: 263774
Noticed while working on scattered relocations.
I do not think these relocs can actually happen in the debug_info section,
but if they happen the code would mishandle them. Explicitely skip them
and warn if we encounter one.
llvm-svn: 259341
Although it seems like clang will never emit scattered relocations in
the debug information (at least I couldn't find a way), we have too
support them for the benefit of other compilers.
As clang doesn't generate them, the included testcase was produced
from hacked up assembly.
llvm-svn: 259339
llvm-dsymutil was misinterpreting the value of common symbols as their
address when it actually contains their size. This didn't impact
llvm-dsymutil's ability to link the debug information for common symbols
because these are always found by name and not by address. Things could
however go wrong when the size of a common object matched the object
file address of another symbol. Depending on the link order of the symbols
the common object might incorrectly evict this other object from the
address to symbol mapping, and then link the evicted symbol with a wrong
binary address.
Use the new ability to have symbols without an object file address to fix
this.
llvm-svn: 259318
This change just changes the data structure that ties symbol names,
object file address and linked binary addresses to accept mappings
with no object file address. Such symbol mappings are not fed into
the debug map yet, so this patch is NFC.
A subsequent patch will make use of this functionality for common
symbols.
llvm-svn: 259317
Summary:
This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html
"I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened."
- Obi Wan Kenobi
Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark
Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D16471
llvm-svn: 258861
Today, we always take into account the possibility that object files
produced by MC may be consumed by an incremental linker. This results
in us initialing fields which vary with time (TimeDateStamp) which harms
hermetic builds (e.g. verifying a self-host went well) and produces
sub-optimal code because we cannot assume anything about the relative
position of functions within a section (call sites can get redirected
through incremental linker thunks).
Let's provide an MCTargetOption which controls this behavior so that we
can disable this functionality if we know a-priori that the build will
not rely on /incremental.
llvm-svn: 256203
Quoting from the comment added to the code:
// Objective-C on i386 uses artificial absolute symbols to
// perform some link time checks. Those symbols have a fixed 0
// address that might conflict with real symbols in the object
// file. As I cannot see a way for absolute symbols to find
// their way into the debug information, let's just ignore those.
llvm-svn: 255350
While still allowing CodeGen/AsmPrinter in llvm to own them using a bump
ptr allocator. (might be nice to replace the pointers there with
something that at least automatically calls their dtors, if that's
necessary/useful, rather than having it done explicitly (I think a typed
BumpPtrAllocator already does this, or maybe a unique_ptr with a custom
deleter, etc))
llvm-svn: 253409
already emitted and fix a latent bug in DIECloner where the DW_CHILDREN_yes
flag is set based on the number of children in the input DIE rather than
the number of children that are actually being cloned.
rdar://problem/23439845
llvm-svn: 252649
The needed lld matching changes to be submitted immediately next,
but this revision will cause lld failures with this alone which is expected.
This removes the eating of the error in Archive::Child::getSize() when the characters
in the size field in the archive header for the member is not a number. To do this we
have all of the needed methods return ErrorOr to push them up until we get out of lib.
Then the tools and can handle the error in whatever way is appropriate for that tool.
So the solution is to plumb all the ErrorOr stuff through everything that touches archives.
This include its iterators as one can create an Archive object but the first or any other
Child object may fail to be created due to a bad size field in its header.
Thanks to Lang Hames on the changes making child_iterator contain an
ErrorOr<Child> instead of a Child and the needed changes to ErrorOr.h to add
operator overloading for * and -> .
We don’t want to use llvm_unreachable() as it calls abort() and is produces a “crash”
and using report_fatal_error() to move the error checking will cause the program to
stop, neither of which are really correct in library code. There are still some uses of
these that should be cleaned up in this library code for other than the size field.
The test cases use archives with text files so one can see the non-digit character,
in this case a ‘%’, in the size field.
These changes will require corresponding changes to the lld project. That will be
committed immediately after this change. But this revision will cause lld failures
with this alone which is expected.
llvm-svn: 252192
in the size field in the archive header for the member is not a number. To do this we
have all of the needed methods return ErrorOr to push them up until we get out of lib.
Then the tools and can handle the error in whatever way is appropriate for that tool.
So the solution is to plumb all the ErrorOr stuff through everything that touches archives.
This include its iterators as one can create an Archive object but the first or any other
Child object may fail to be created due to a bad size field in its header.
Thanks to Lang Hames on the changes making child_iterator contain an
ErrorOr<Child> instead of a Child and the needed changes to ErrorOr.h to add
operator overloading for * and -> .
We don’t want to use llvm_unreachable() as it calls abort() and is produces a “crash”
and using report_fatal_error() to move the error checking will cause the program to
stop, neither of which are really correct in library code. There are still some uses of
these that should be cleaned up in this library code for other than the size field.
Also corrected the code where the size gets us to the “at the end of the archive”
which is OK but past the end of the archive will return object_error::parse_failed now.
The test cases use archives with text files so one can see the non-digit character,
in this case a ‘%’, in the size field.
llvm-svn: 250906
Even if we don't have it in PATH, lipo should usually exist in the same directory
as dsymutil. Keep the fallback looking up the PATH, it's very useful when
testing a non-installed executable.
llvm-svn: 249762
if there exists not definition for the type.
For this to work, we need to clone the imported modules before building
the decl context chains of the DIEs in the non-skeleton CUs.
llvm-svn: 249362
This patch extends llvm-dsymutil's ODR type uniquing machinery to also
resolve forward decls for types defined in clang modules.
http://reviews.llvm.org/D13038
llvm-svn: 248398
Summary:
This is the first patch in the series to migrate Triple's (which are ambiguous)
to TargetTuple's (which aren't).
For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.
This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ API's have
been left as-is for the moment to reduce patch size.
This commit also contains a trivial patch to clang to account for the C++ API
change. Thanks go to Pavel Labath for fixing LLDB for me.
Reviewers: rengolin
Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D10969
llvm-svn: 247692
Summary:
This is the first patch in the series to migrate Triple's (which are ambiguous)
to TargetTuple's (which aren't).
For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.
This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ API's have
been left as-is for the moment to reduce patch size.
This commit also contains a trivial patch to clang to account for the C++ API
change.
Reviewers: rengolin
Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D10969
llvm-svn: 247683
When cloning the debug info for a function that hasn't been linked,
strip the DIEs from all location attributes that wouldn't contain any
meaningful information anyway.
This kind of situation can happen when a function got discarded by the
linker, but its debug information is still wanted in the final link
because it was marked as required as some other DIE dependency. The easiest
way to get into that situation is to have using directives. They get
linked unconditionally, but their targets might not always be present.
llvm-svn: 247386
lldb doesn't like having variables named as an existing type. In order to
ease debugging, rename those variables to avoid that conflict.
llvm-svn: 247385
With a fix for big endian machines. Thanks to Daniel Sanders for the debugging!
Original commit message:
The binaries containing the linked DWARF generated by dsymutil are not
standard relocatable object files like emitted did previsously. They should be
dSYM companion files, which means they have a different file type in the
header, but also a couple other peculiarities:
- they contain the segments and sections from the original binary in their
load commands, but not the actual contents. This means they get an address
and a size, but their offset is always 0 (but these are not virtual sections)
- they also conatin all the defined symbols from the original binary
This makes MC a really bad fit to emit these kind of binaries. The approach
that was used in this patch is to leverage MC's section layout for the
debug sections, but to use a replacement for MachObjectWriter that lives
in MachOUtils.cpp. Some of the low-level helpers from MachObjectWriter
were reused too.
llvm-svn: 246673
The fix is trivial (The actual patch is 2 lines, but as it changes
indentation it looks like more).
clang does not produce this kind of (slightly bogus) debug info
anymore, thus I had to rely on a hand-crafted assembly test to trigger
that case.
llvm-svn: 246410
The value of an inlined subprogram low_pc attribute should not
get relocated, but it can happen that it matches the enclosing
function's start address and thus gets the generic treatment.
Special case it to avoid applying the PC offset twice.
llvm-svn: 246406
The binaries containing the linked DWARF generated by dsymutil are not
standard relocatable object files like emitted did previsously. They should be
dSYM companion files, which means they have a different file type in the
header, but also a couple other peculiarities:
- they contain the segments and sections from the original binary in their
load commands, but not the actual contents. This means they get an address
and a size, but their offset is always 0 (but these are not virtual sections)
- they also conatin all the defined symbols from the original binary
This makes MC a really bad fit to emit these kind of binaries. The approach
that was used in this patch is to leverage MC's section layout for the
debug sections, but to use a replacement for MachObjectWriter that lives
in MachOUtils.cpp. Some of the low-level helpers from MachObjectWriter
were reused too.
llvm-svn: 246012
llvm-dsymutil needs to emit dSYM companion bundles. These are binary files
that replicate some of the orignal binary file properties (sections and
symbols). To get acces to these properties, pass the binary path in the
debug map.
llvm-svn: 246011
There was an issue in the test setup because the test requires an arch that
wasn't filtered by the lit.local.cfg, but given the set of bots that failed,
I'm not confident this is the (only) issue. So this commit also adds more
output to the test to help me track down the failure if it happens again.
Original commit message:
[dsymutil] Rewrite thumb triple names in user visible messages.
We autodetect triples from the input file(s) while reading the mach-o debug map.
As we need to create a Target from those triples, we always chose the thumb
variant (because the arm variant might not be 'instantiable' eg armv7m). The
user visible architecture names should still be 'arm' and not 'thumb' variants
though.
llvm-svn: 245988
This reverts commit r245960.
Multiple bots are failing on the new test. It seemd like llvm-dsymutil exits with an error. Investigating.
llvm-svn: 245964
We autodetect triples from the input file(s) while reading the mach-o debug map.
As we need to create a Target from those triples, we always chose the thumb
variant (because the arm variant might not be 'instantiable' eg armv7m). The
user visible architecture names should still be 'arm' and not 'thumb' variants
though.
llvm-svn: 245960
Seq.emplace_back(Seq.back());
does not work as planned, since Seq.back() may become a dangling reference
when emplace_back is called and possibly reallocates vector. To avoid this,
the vector allocation should be reserved first and only then used.
This broke test/tools/dsymutil/X86/custom-line-table.test with Visual C++ 2013.
llvm-svn: 244405
llvm-dsymutil has to be able to process debug info produced by other compilers
which use different line table settings. The testcase wasn't generated by
another compiler, but by a modified clang.
llvm-svn: 244319
A dSYM bundle is a file hierarchy that looks slike this:
<bundle name>.dSYM/
Contents/
Info.plist
Resources/
DWARF/
<DWARF file(s)>
This is the default output mode of dsymutil.
llvm-svn: 244270
dsymutil should by default generate dSYM bundles which are filesystem
hierarchies containing the debug info and an additional Info.plist.
Currently llvm-dsymutil emits raw binaries containing the debug info.
This is what we call the 'flat mode'. Add a -f/-flat option that is
supposed to enable that flat mode, but don't wire it for now, only
pass it to the tests that will need it to stay functional once we
do bundle generation by default.
This basically makes this commit NFC and removes the noise from the
actual commit that adds support for bundle generation.
llvm-svn: 244269
The files were never written to and then deleted, but they were created
nonetheless. To prevent that, create a wrapper around the 2 variants of
createUniqueFile and use the one that only does an access(Exists) call
to check for name unicity in -no-output mode.
llvm-svn: 244172
This option allows to select a subset of the architectures when
performing a universal binary link. The filter is done completely
in the mach-o specific part of the code.
llvm-svn: 244160
The DWARF linker isn't touched by this, the implementation links
individual files and merges them together into a fat binary by
calling out to the 'lipo' utility.
The main change is that the MachODebugMapParser can now return
multiple debug maps for a single binary.
The test just verifies that lipo would be invoked correctly, but
doesn't actually generate a binary. This mimics the way clang
tests its external iplatform tools integration.
llvm-svn: 244087
llvm-dsymutil will start creating temporary files in a followup
commit. To ease the correct cleanup of this files, introduce a
helper called to exit dsymutil.
llvm-svn: 244086
Prevent all the unrelated LLVM options to appear in the -help output
by introducing a tool specific option category. As a drive-by improve
the wording of the help message.
llvm-svn: 243583
The dsymutil-classic -v option dumps the tool version rather than
putting it in verbose mode. Rename -v to -verbose and update the
tests that use it (in the process removing it from a few tests that
didn't require it anymore since the -dump-debug-map option was
introduced).
A followup commit will reintroduce the -v option that dumps the
version.
llvm-svn: 243582
This patch allows llvm-dsymutil to read universal (aka fat) macho object
files and archives. The patch touches nearly everything in the BinaryHolder,
but it is fairly mechinical: the methods that returned MemoryBufferRefs or
ObjectFiles now return a vector of those, and the high-level access function
takes a triple argument to select the architecture.
There is no support yet for handling fat executables and thus no support for
writing fat object files.
llvm-svn: 243096
MachOObjectFile offers a method for detecting the correct triple, use
it instead of the previous approximation. This doesn't matter right
now, but it will become important for mach-o universal (fat) binaries.
llvm-svn: 243095
Call a helper that resets all the internal state of the BinaryHolder
when we change the underlying memory buffer. Makes a followup patch
a tiny bit smaller.
llvm-svn: 243094
The debug map contains the timestamp of the object files in references.
We do not check these in the general case, but it's really useful if
you have archives where different versions of an object file have been
appended. This allows llvm-dsymutil to find the right one.
llvm-svn: 242965
This optimization allows the DWARF linker to reuse definition of
types it has emitted in previous CUs rather than reemitting them
in each CU that references them. The size and link time gains are
huge. For example when linking the DWARF for a debug build of
clang, this generates a ~150M dwarf file instead of a ~700M one
(the numbers date back a bit and must not be totally accurate
these days).
As with all the other parts of the llvm-dsymutil codebase, the
goal is to keep bit-for-bit compatibility with dsymutil-classic.
The code is littered with a lot of FIXMEs that should be
addressed once we can get rid of the compatibilty goal.
llvm-svn: 242847
getSymbolValue now returns a value that in convenient for most callers:
* 0 for undefined
* symbol size for common symbols
* offset/address for symbols the rest
Code that needs something more specific can check getSymbolFlags.
llvm-svn: 241605