In PDBs, symbol records must be aligned to four bytes. However, in the
object file, symbol records may not be aligned. MSVC does not pad out
symbol records to make sure they are aligned. That means the linker has
to do extra work to insert the padding. Currently, LLD calculates the
required space with alignment, and copies each record one at a time
while padding them out to the correct size. It has a fast path that
avoids this copy when the records are already aligned.
This change fixes a bug in that codepath so that the copy is actually
saved, and tweaks LLVM's symbol record emission to align symbol records.
Here's how things compare when doing a plain clang Release+PDB build:
- objs are 0.65% bigger (negligible)
- link is 3.3% faster (negligible)
- saves allocating 441MB
- new LLD high water mark is ~1.05GB
llvm-svn: 349431
The code here wants the output section offset of the instruction
requiring the errata patch, not the virtual address. Without this
change we can end up placing a patch out of range if the virtual
address of the code section is large enough.
Differential Revision: https://reviews.llvm.org/D55732
llvm-svn: 349386
ARM Architecture v6m is used by the smallest microcontrollers such as the
cortex-m0. It is Thumb only (no Thumb 2) which prevents it from using the
existing Thumb 2 range extension thunks as these use the Thumb 2 movt/movw
instructions. Range extension thunks are not usually needed for
microcontrollers due to the small amount of flash and ram on the device,
however if code is copied from flash into ram then a range extension thunk
is required to call that code.
This change adds support for v6m range extension thunks. The procedure call
standard APCS permits a thunk to corrupt the intra-procedural scratch
register r12 (referred to as ip in the APCS). Most Thumb instructions do
not permit access to high registers (r8 - r15) so the thunks must spill
some low registers (r0 - r7) to perform the control transfer.
Fixes pr39922
Differential Revision: https://reviews.llvm.org/D55555
llvm-svn: 349337
Previously we considered R_ARM_V4BX to be an absolute relocation,
which meant that we rejected it in read-only sections in PIC output
files. Instead, treat it as a hint relocation so that relocation
processing ignores it entirely.
Also fix a problem with the test case where it was never being run
because it has a .yaml extension and we don't run tests with that
extension.
Differential Revision: https://reviews.llvm.org/D55728
llvm-svn: 349216
Capture the stderr from 'tar --version' call as otherwise error messages
spill onto user's terminal unnecessarily (e.g. on NetBSD where tar does
not support long options). While at it, refactor the code to use
communicate() instead of reinventing the wheel.
Differential Revision: https://reviews.llvm.org/D55443
llvm-svn: 349204
`--plugin-opt=emit-llvm` is an option for LTO. It makes the linker to
combine all bitcode files and write the result to an output file without
doing codegen. Gold LTO plugin has this option.
This option is being used for some post-link code analysis tools that
have to see a whole program but don't need to see them in the native
machine code.
Differential Revision: https://reviews.llvm.org/D55717
llvm-svn: 349198
When calling BinaryStreamArray::drop_front(), if the stream
is skewed it means we must never drop the first bytes of the
stream since offsets which occur in records assume the existence
of those bytes. So if we want to skip the first record in a
stream, then what we really want to do is just set the begin
pointer to the next record. But we shouldn't actually remove
those bytes from the underlying view of the data.
llvm-svn: 349066
In the ABI for the 64-bit Arm architecture the section on weak references
states:
During linking, the symbol value of an undefined weak reference is:
- Zero if the relocation type is absolute
- The address of the place if the relocation type is pc-relative.
The relocations associated with an ADRP are relative so we should resolve
the undefined weak reference to the place instead of 0. This matches GNU
ld.bfd behaviour.
fixes pr34928
Differential Revision: https://reviews.llvm.org/D55599
llvm-svn: 349024
Unlike GNU tar and libarchive bsdtar, NetBSD 'tar -t' output does not
use C-style escapes and instead outputs paths literally. Fix the test
to account both for escaped and literal backslash output.
Differential Revision: https://reviews.llvm.org/D55441
llvm-svn: 348628
This was a missing piece.
We started to print LMAs and information about assignments,
but did not do that for assignments outside of section declarations yet.
The patch implements it.
Differential revision: https://reviews.llvm.org/D45314
llvm-svn: 348468
The test breaks on buildbots that don't enable the x86 backend. Other
tests in this directory explicitly require x86, so this should do the
trick.
llvm-svn: 348466
This is a part of
https://bugs.llvm.org/show_bug.cgi?id=39885
Linker script specification says:
"You can specify a file name to include sections from a particular file. You would
do this if one or more of your files contain special data that needs to be at a
particular location in memory."
LLD did not accept this syntax. The patch implements it.
Differential revision: https://reviews.llvm.org/D55324
llvm-svn: 348463
Previously, we have a hash table containing strings and their offsets
to manage mergeable strings. Technically we can live without that, because
we can do binary search on a vector of mergeable strings to find a mergeable
strings.
We did have both the hash table and the binary search because we thought
that that is faster.
We recently observed that lld tend to consume more memory than gold when
building an output with debug info. A few percent of memory is consumed by
the hash table. So, we needed to reevaluate whether or not having the extra
hash table is a good CPU/memory tradeoff. I run a few benchmarks with and
without the hash table.
I got a mixed result for the benchmark. We observed a regression for some
programs by removing the hash table (that's what we expected), but we also
observed that performance imrpovements for some programs. This is perhaps
due to reduced memory usage.
Differential Revision: https://reviews.llvm.org/D55234
llvm-svn: 348401
Previously these were dropped. We now understand them sufficiently
well to start emitting them. From the debugger's perspective, this
now enables us to have debug info about typedefs (both global and
function-locally scoped)
Differential Revision: https://reviews.llvm.org/D55228
llvm-svn: 348306
Patch from Andrew Kelley.
For context, see https://bugs.llvm.org/show_bug.cgi?id=39862
The use case is embedded / OS programming where the kernel wants
access to its own debug info via mapped dwarf info. I have a proof of
concept of this working, using this linker script snippet:
.rodata : ALIGN(4K) {
*(.rodata)
__debug_info_start = .;
KEEP(*(.debug_info))
__debug_info_end = .;
__debug_abbrev_start = .;
KEEP(*(.debug_abbrev))
__debug_abbrev_end = .;
__debug_str_start = .;
KEEP(*(.debug_str))
__debug_str_end = .;
__debug_line_start = .;
KEEP(*(.debug_line))
__debug_line_end =
.;
__debug_ranges_start
= .;
KEEP(*(.debug_ranges))
__debug_ranges_end
= .;
}
Differential revision: https://reviews.llvm.org/D55276
llvm-svn: 348291
When linking the linux kernel on ppc64le
ld.lld -EL -m elf64lppc -Bstatic --orphan-handling=warn --build-id -o
.tmp_vmlinux1 -T ./arch/powerpc/kernel/vmlinux.lds --whole-archive
built-in.a --no-whole-archive --start-group lib/lib.a --end-group
ld.lld: error: discarding .rela.plt section is not allowed
The linker script discards with the following matches
*(.glink .iplt .plt .rela* .comment)
Differential Revision: https://reviews.llvm.org/D54871
llvm-svn: 348258
When linking the linux kernel on ppc64 and ppc
ld.lld: error: unrecognized reloc 11
11 is PPC_REL14 and PPC64_REL14
Differential revision: https://reviews.llvm.org/D54868
llvm-svn: 348255
At least on Linux, if a file size given to FileOutputBuffer is greater
than 2^63, it fails with "Invalid argument" error, which is not a
user-friendly error message. With this patch, lld prints out "output
file too large" instead.
llvm-svn: 348153
There is no need to check that In.DynSymTab != nullptr,
because `includeInDynsym` already checks for `!Config->HasDynSymTab`
and `HasDynSymTab` is the pre-condition for In.DynSymTab creation.
llvm-svn: 348143
Now LLD might build the broken/incomplete .gdb_index when some DWARF v5
sections (like .debug_rnglists and .debug_addr) are used.
Particularly, for the case above, we emit an empty address area.
A test case is provided and patch fixes the issue.
Differential revision: https://reviews.llvm.org/D55109
llvm-svn: 348119
We initialize .text section with 0xcc (INT3 instruction), so we need to
explicitly write data even if it is zero if it can be in a .text section.
If you specify /merge:.rdata=.text, .rdata (which contains .idata) is put
to .text, so we need to do this.
Fixes https://bugs.llvm.org/show_bug.cgi?id=39826
Differential Revision: https://reviews.llvm.org/D55098
llvm-svn: 348000
The _GLOBAL_OFFSET_TABLE_ is a linker defined symbol that is placed at
some location relative to the .got, .got.plt or .toc section. On some
targets such as Arm the correctness of some code sequences using a
relocation to _GLOBAL_OFFSET_TABLE_ depend on the value of the symbol
being in the linker defined place. Follow the ld.gold example and give
a multiple symbol definition error. The ld.bfd behaviour is to ignore the
definition in the input object and redefine it, which seems like it could
be more surprising.
fixes pr39587
Differential Revision: https://reviews.llvm.org/D54624
llvm-svn: 347854
This is an reland of rL343155 which got reverted because
of a sphinx failure on the buildbot.
Differential Revision: https://reviews.llvm.org/D54982
llvm-svn: 347830
Summary:
This reinstates what I originally intended to do in D54361.
It removes the assumption that .debug_gnu_pubnames has increasing CuOffset.
Now we do better than gold here: when .debug_gnu_pubnames contains
multiple sets, gold would think every set has the same CU index as the
first set (incorrect).
Reviewed By: ruiu
Reviewers: ruiu, dblaikie, espindola
Subscribers: emaste, arichardson, arphaman, llvm-commits
Differential Revision: https://reviews.llvm.org/D54483
llvm-svn: 347820
This patch also makes getPltEntryOffset a non-member function because
it doesn't depend on any private members of the TargetInfo class.
I tried a few different ideas, and it seems this change fits in best to me.
Differential Revision: https://reviews.llvm.org/D54981
llvm-svn: 347781
The DT_PLTRELSZ dynamic tag is calculated using the size of the
OutputSection containing the In.RelaPlt InputSection. This will work for the
default no linker script case and the majority of linker scripts.
Unfortunately it doesn't work for some 'almost' sensible linker scripts. It
is permitted by ELF to have a single OutputSection containing both
In.RelaDyn, In.RelaPlt and In.RelaIPlt. It is also permissible for the range
of memory [DT_RELA, DT_RELA + DT_RELASZ) and the range
[DT_JMPREL, DT_JMPREL + DT_JMPRELSZ) to overlap as long as the the latter
range is at the end.
To support this type of linker script use the specific InputSection sizes.
Fixes pr39678
Differential Revision: https://reviews.llvm.org/D54759
llvm-svn: 347736
The number of sections is used in assignAddresses (in
finalizeAddresses) and the space for all sections is permanent from
that point on, even if we later decide we won't write some of them.
The VirtualSize field also gets calculated in assignAddresses, so we
need to manually check whether the section is empty here instead.
Differential Revision: https://reviews.llvm.org/D54495
llvm-svn: 347704
Summary:
This speeds up linking clang.exe/pdb with /DEBUG:GHASH by 31%, from
12.9s to 9.8s.
Symbol records are typically small (16.7 bytes on average), but we
processed them one at a time. CVSymbol is a relatively "large" type. It
wraps an ArrayRef<uint8_t> with a kind an optional 32-bit hash, which we
don't need. Before this change, each DbiModuleDescriptorBuilder would
maintain an array of CVSymbols, and would write them individually with a
BinaryItemStream.
With this change, we now add symbols that happen to appear contiguously
in bulk. For each .debug$S section (roughly one per function), we
allocate two copies, one for relocation, and one for realignment
purposes. For runs of symbols that go in the module stream, which is
most symbols, we now add them as a single ArrayRef<uint8_t>, so the
vector DbiModuleDescriptorBuilder is roughly linear in the number of
.debug$S sections (O(# funcs)) instead of the number of symbol records
(very large).
Some stats on symbol sizes for the curious:
PDB size: 507M
sym bytes: 316,508,016
sym count: 18,954,971
sym byte avg: 16.7
As future work, we may be able to skip copying symbol records in the
linker for realignment purposes if we make LLVM write them aligned into
the object file. We need to double check that such symbol records are
still compatible with link.exe, but if so, it's definitely worth doing,
since my profile shows we spend 500ms in memcpy in the symbol merging
code. We could potentially cut that in half by saving a copy.
Alternatively, we could apply the relocations *after* we iterate the
symbols. This would require some careful re-engineering of the
relocation processing code, though.
Reviewers: zturner, aganea, ruiu
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D54554
llvm-svn: 347687