Handling of --export/--undefined can pull in lazy symbols which in turn
can pull in referenced to optional symbols. We need to delay the
creation of optional symbols until all possible references to them have
been created.
Differential Revision: https://reviews.llvm.org/D66768
llvm-svn: 370012
PR42990. For `SECTIONS { b = a; . = 0xff00 + (a >> 8); a = .; }`,
we currently set st_value(a)=0xff00 while st_value(b)=0xffff.
The following call tree demonstrates the problem:
```
link<ELF64LE>(Args);
Script->declareSymbols(); // insert a and b as absolute Defined
Writer<ELFT>().run();
Script->processSectionCommands();
addSymbol(cmd); // a and b are re-inserted. LinkerScript::getSymbolValue
// is lazily called by subsequent evaluation
finalizeSections();
forEachRelSec(scanRelocations<ELFT>);
processRelocAux // another problem PR42506, not affected by this patch
finalizeAddressDependentContent(); // loop executed once
script->assignAddresses(); // a = 0, b = 0xff00
script->assignAddresses(); // a = 0xff00, _end = 0xffff
```
We need another assignAddresses() to finalize the value of `a`.
This patch
1) modifies assignAddress() to track the original section/value of each
symbol and return a symbol whose section/value has changed.
2) moves the post-finalizeSections assignAddress() inside the loop
of finalizeAddressDependentContent() and makes it iterative.
Symbol assignment may not converge so we make a few attempts before
bailing out.
Note, assignAddresses() must be called at least twice. The penultimate
call finalized section addresses while the last finalized symbol values.
It is somewhat obscure and there was no comment.
linkerscript/addr-zero.test tests this.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D66279
llvm-svn: 369889
--strip-all suppresses the creation of in.symtab
This can cause a null pointer dereference in OutputSection::finalize()
// --emit-relocs => copyRelocs is true
if (!config->copyRelocs || (type != SHT_RELA && type != SHT_REL))
return;
...
link = in.symTab->getParent()->sectionIndex; // in.symTab is null
Let's just disallow the combination. In some cases the combination can
cause GNU linkers to fail:
* ld.bfd: final link failed: invalid operation
* gold: internal error in set_no_output_symtab_entry, at ../../gold/object.h:1814
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D66704
llvm-svn: 369878
Reported at https://reviews.llvm.org/D64930#1642223
If the only section of a PT_LOAD is a SHT_NOBITS section (e.g. .bss), we
may not align its sh_offset. p_offset of the PT_LOAD will be set to
sh_offset, and we will get p_offset!=p_vaddr (mod p_align). If such
executable is mapped by the Linux kernel, it will segfault.
After D64906, this may happen the non-linker script case.
The linker script case has had this issue for a long time.
This was fixed by rL321657 (but the test linkerscript/nobits-offset.s
failed to test a SHT_NOBITS section), but broken by rL345154.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D66658
llvm-svn: 369828
Summary:
This adds the -lto-obj-path option to lld-link. This can be
used to specify a path at which to write a native object file for
the full LTO part when using LTO unit splitting.
Reviewers: ruiu, tejohnson, pcc, rnk
Reviewed By: ruiu, rnk
Subscribers: mehdi_amini, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65964
llvm-svn: 369559
llvm-objdump can switch between ARM/Thumb states after D60927.
In a few lld tests, we run both
* llvm-objdump -d -triple=thumbv7a-none-linux-gnueabi %t
* llvm-objdump -d -triple=armv7a-none-linux-gnueabi %t
to test ARM/Thumb parts of the same file. In many cases we can just
run one command. There is a problem that prevents us from cleaning
more tests (e.g. test/ELF/arm-thumb-interwork-thunk.s):
In llvm-objdump, while we have ARM/Thumb (primary and secondary)
MCDisassembler and MCSubtargetInfo, we have just one MCInstrAnalysis
which is used to resolve the targets of calls in both ARM/Thumb parts.
// ThumbMCInstrAnalysis evaluating ARM parts or ARMMCInstrAnalysis evaluating Thumb parts
// will have incorrect offsets.
// An example of llvm-objdump -d -triple=thumbv7a on ARM part:
1304: 3d ff ff fa blx #-780 # no <...>
1308: 06 00 00 ea b #24 <arm_caller+0x24> # wrong target due to wrong offset
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D66539
llvm-svn: 369535
This removes the precompiled binary and improves the
check of the error reported.
Differential revision: https://reviews.llvm.org/D66523
llvm-svn: 369516
This fixed a bug in r369488. When config->isRela is false, i->r_addend
is not initialized (see encodeDynamicReloc). So we should check
config->isRela before accessing r_addend:
- if (j - i < 3 || i->r_addend)
+ if (j - i < 3 || (config->isRela && i->r_addend != 0))
Original description:
Currently, with Android dynamic relocation packing, only relative
relocations are grouped together. This patch implements similar
packing for non-relative relocations.
The implementation groups non-relative relocations with the same
r_info and r_addend, if using RELA. By requiring a minimum group
size of 3, this achieves smaller relocation sections. Building Android
for an ARM32 device, I see the total size of /system/lib decrease by
392 KB.
Grouping by r_info also allows the runtime dynamic linker to implement
an 1-entry cache to reduce the number of symbol lookup required. With
such 1-entry cache implemented on Android, I'm seeing 10% to 20%
reduction in total time spent in runtime linker for several executables
that I tested.
As a simple correctness check, I've also built x86_64 Android and booted
successfully.
Differential Revision: https://reviews.llvm.org/D65242
Patch by Vic Yang
llvm-svn: 369507
Currently, with Android dynamic relocation packing, only relative
relocations are grouped together. This patch implements similar
packing for non-relative relocations.
The implementation groups non-relative relocations with the same
r_info and r_addend, if using RELA. By requiring a minimum group
size of 3, this achieves smaller relocation sections. Building Android
for an ARM32 device, I see the total size of /system/lib decrease by
392 KB.
Grouping by r_info also allows the runtime dynamic linker to implement
an 1-entry cache to reduce the number of symbol lookup required. With
such 1-entry cache implemented on Android, I'm seeing 10% to 20%
reduction in total time spent in runtime linker for several executables
that I tested.
As a simple correctness check, I've also built x86_64 Android and booted
successfully.
Differential Revision: https://reviews.llvm.org/D66491
Patch by Vic Yang!
llvm-svn: 369488
This avoids producing an output file if errors appeared late in the
linking process (e.g. while fixing relocations, or as in the test,
while checking for multiple resources). If an output file is produced,
build tools might not retry building it on rebuilds, even if a previous
build failed due to the error return code.
Differential Revision: https://reviews.llvm.org/D66491
llvm-svn: 369445
Debug sections are special in that they can contain relocations against
symbols that are not present in the final output (i.e. not live).
However it is also possible to have R_WASM_TABLE_INDEX relocations
against symbols that don't have a table index assigned (since they are
not address taken by actual code.
Fixes: https://github.com/emscripten-core/emscripten/issues/9023
Differential Revision: https://reviews.llvm.org/D66435
llvm-svn: 369423
This is used by Wine for manually crafting export tables.
If the input object contains .edata sections, GNU ld references them
in the export directory instead of synthesizing an export table using
either export directives or the normal auto export mechanism. (AFAIK,
historically, way way back, GNU ld didn't support synthesizing the
export table - one was supposed to generate it using dlltool and link
it in instead.)
If faced with --out-implib and --output-def, GNU ld still populates
those output files with the same export info as it would have generated
otherwise, disregarding the input .edata. As this isn't an intended
usage combination, I'm not adding checks for that in tests.
Differential Revision: https://reviews.llvm.org/D65903
llvm-svn: 369358
Ported the D64906 technique to EM_386.
If `sh_addralign(.tdata) < sh_addralign(.tbss)`,
we can potentially make `p_vaddr(PT_TLS)%p_align(PT_TLS) != 0`.
ld.so that are known to have problems if p_vaddr%p_align!=0:
* FreeBSD 13.0-CURRENT rtld-elf
* glibc https://sourceware.org/bugzilla/show_bug.cgi?id=24606
New test i386-tls-vaddr-align.s checks our workaround makes p_vaddr%p_align = 0.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D65865
llvm-svn: 369347
Ported the D64906 technique to AArch64. It deletes 3 alignments at
PT_LOAD boundaries for the default case: the size of an aarch64 binary
decreases by at most 192kb.
If `sh_addralign(.tdata) < sh_addralign(.tbss)`,
we can potentially make `p_vaddr(PT_TLS)%p_align(PT_TLS) != 0`.
ld.so that are known to have problems if p_vaddr%p_align!=0:
* musl<=1.1.22
* FreeBSD 13.0-CURRENT (and before) rtld-elf arm64
New test aarch64-tls-vaddr-align.s checks that our workaround makes p_vaddr%p_align = 0.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D64930
llvm-svn: 369344
This change affects the non-linker script case (precisely, when the
`SECTIONS` command is not used). It deletes 3 alignments at PT_LOAD
boundaries for the default case: the size of a powerpc64 binary can be
decreased by at most 192kb. The technique can be ported to other
targets.
Let me demonstrate the idea with a maxPageSize=65536 example:
When assigning the address to the first output section of a new PT_LOAD,
if the end p_vaddr of the previous PT_LOAD is 0x10020, we advance to
the next multiple of maxPageSize: 0x20000. The new PT_LOAD will thus
have p_vaddr=0x20000. Because p_offset and p_vaddr are congruent modulo
maxPageSize, p_offset will be 0x20000, leaving a p_offset gap [0x10020,
0x20000) in the output.
Alternatively, if we advance to 0x20020, the new PT_LOAD will have
p_vaddr=0x20020. We can pick either 0x10020 or 0x20020 for p_offset!
Obviously 0x10020 is the choice because it leaves no gap. At runtime,
p_vaddr will be rounded down by pagesize (65536 if
pagesize=maxPageSize). This PT_LOAD will load additional initial
contents from p_offset ranges [0x10000,0x10020), which will also be
loaded by the previous PT_LOAD. This is fine if -z noseparate-code is in
effect or if we are not transiting between executable and non-executable
segments.
ld.bfd -z noseparate-code leverages this technique to keep output small.
This patch implements the technique in lld, which is mostly effective on
targets with large defaultMaxPageSize (AArch64/MIPS/PPC: 65536). The 3
removed alignments can save almost 3*65536 bytes.
Two places that rely on p_vaddr%pagesize = 0 have to be updated.
1) We used to round p_memsz(PT_GNU_RELRO) up to commonPageSize (defaults
to 4096 on all targets). Now p_vaddr%commonPageSize may be non-zero.
The updated formula takes account of that factor.
2) Our TP offsets formulae are only correct if p_vaddr%p_align = 0.
Fix them. See the updated comments in InputSection.cpp for details.
On targets that we enable the technique (only PPC64 now),
we can potentially make `p_vaddr(PT_TLS)%p_align(PT_TLS) != 0`
if `sh_addralign(.tdata) < sh_addralign(.tbss)`
This exposes many problems in ld.so implementations, especially the
offsets of dynamic TLS blocks. Known issues:
FreeBSD 13.0-CURRENT rtld-elf (i386/amd64/powerpc/arm64)
glibc (HEAD) i386 and x86_64 https://sourceware.org/bugzilla/show_bug.cgi?id=24606
musl<=1.1.22 on TLS Variant I architectures (aarch64/powerpc64/...)
So, force p_vaddr%p_align = 0 by rounding dot up to p_align(PT_TLS).
The technique will be enabled (with updated tests) for other targets in
subsequent patches.
Reviewed By: ruiu
Differential Revision: https://reviews.llvm.org/D64906
llvm-svn: 369343
After D66007/r369262, if the control flow reaches `if (sym.isUndefined())`, we know:
* The relocation is not a link-time constant => symbol is preemptable => Undefined or SharedSymbol
* Not an undef weak.
* -no-pie.
* The symbol type is neither STT_OBJECT nor STT_FUNC.
ld.lld --export-dynamic --unresolved-symbols=ignore-all %t.o can satisfy
these conditions. Delete the isUndefined() test so that we error
`symbol '...' has no type`, because we don't know the type to make the
decision to create copy relocation/canonical PLT.
llvm-svn: 369271
In processRelocAux(), we handle errors before copy relocation/canonical PLT.
This makes error checking a bit complex because we have to check for
conditions that will be allowed by copy relocation/canonical PLT.
Instead, move copy relocation/canonical PLT before error checking. This
simplifies the previous clumsy error checking code
`config->shared || (config->pie && expr == R_ABS && type != target->symbolicRel)`
to the simple `config->isPic`. Some diagnostics can be reported in
different ways. The code motion changes diagnostics for some contrived
test cases:
* copy-rel-pie-error.s -> copy-rel-pie2.s:
It was rejected before but accepted now. ld.bfd also accepts the case.
* copy-errors.s: "cannot preempt symbol" changes to "symbol 'bar' has no type"
* got32{,x}-i386.s: the suggestion changes from "-fPIC or -Wl,-z,notext" to "-fPIE"
* x86-64-dyn-rel-error5.s: one diagnostic changes for -pie case
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D66007
llvm-svn: 369262
Add a test that takes the maximum amount of passes permitted to converge.
This will make sure that any symbol defined in a linker script gets the
correct value and that any other convergence limit involving symbol address
doesn't restrict Thunk convergence.
Differential Revision: https://reviews.llvm.org/D66346
llvm-svn: 369246
Fixes https://github.com/ClangBuiltLinux/linux/issues/640
R_PPC64_REL16_HI was incorrectly computed as an R_ABS relocation.
rLLD368964 made it a linker failure. Change it to use R_PC to fix the
failures.
Add ppc64-reloc-rel.s for these R_PPC64_REL* tests.
llvm-svn: 369184
Like rLLD354040.
Previously, for unrecognized relocation types, in -no-pie/-pie mode, we got something like:
foo.o: unrecognized relocation ...
In -shared mode:
error: can't create dynamic relocation ... against symbol: yyy in readonly segment
Delete the default case from AArch64::getRelExpr and add the error there.
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D66277
llvm-svn: 368983
Some options are implemented now:
--no-warn-common : r263413
--allow-shlib-undefined : r352826
Some are ignored but were not reflected in this test.
llvm-svn: 368837
Support the equals form of the long --entry=<symbol> option,
add a test for the -e<symbol> form.
Add tests for single dash forms of -exclude-all-symbols and
-export-all-symbols.
Support single-dash forms of -out-implib and -output-def, support
the equals form of --output-def=<file>. (We previously had a test
to explicitly disallow -out-implib, but it turns out that GNU ld
actually does support it just fine, despite also matching the
-o<file> option.)
Disallow the double-dashed --u form, add a test for -u<symbol>.
Differential Revision: https://reviews.llvm.org/D66066
llvm-svn: 368816
Currently the following 3 relocation types do not trigger the creation
of a canonical PLT (which changes STT_GNU_IFUNC to STT_FUNC and
redirects all references):
1) GOT-generating (`needsGot`)
2) PLT-generating (`needsPlt`)
3) R_ABS with 0 addend in a writable location. This is used for
for ifunc function pointers in writable sections such as .data and .toc.
This patch deletes case 3) to simplify the R_*_IRELATIVE generating
logic added in D57371. Other advantages:
* It is guaranteed no more than 1 R_*_IRELATIVE is created for an ifunc.
* PPC64: no need to special case ifunc in toc-indirect to toc-relative relaxation. See D65755
The deleted elf::addIRelativeRelocs demonstrates that one-pass scan
through relocations makes several optimizations difficult. This is
something we can think about in the future.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D65995
llvm-svn: 368661
When producing a DSO, the isPreemptible property of a Defined with
default or protected visibility is affected by the --dynamic-list file,
but not by interposable symbols in other DSOs.
llvm-svn: 368649
The filename part in the message header is used by Visual Studio
to fill Error List so that a user can click on an item and jump
to the mentioned location. If we use only the name of a source file
and not the full path, Visual Studio might be unable to find the right
file or, even worse, show a wrong one.
Differential Revision: https://reviews.llvm.org/D65875
llvm-svn: 368409
If the dot gets moved by an explicit section address, an empty gap between sections could be created. The encompassing region for the section being parsed needs to be expanded to include the gap.
Differential Revision: https://reviews.llvm.org/D65722
Patch by Gabriel Smith!
llvm-svn: 368379
This ensures these errors produce a non-zero exit and improves the
context (providing the name of the input object and section being
parsed).
llvm-svn: 368378
In the case where C identifier sections have SHF_LINK_ORDER they will most
likely be placed in the same partition as the section that they are associated
with. But unless this happens to be the main partition, this will cause them
to be excluded from the range covered by the __start_ and __stop_ symbols,
which may lead to incorrect program behaviour. So we need to move them
all into the main partition so that they will be covered by the __start_
and __stop_ symbols.
We may want to refine this approach later and allow different __start_/__stop_
symbol values for different partitions. This would only make sense for
relocations from SHT_NOTE sections since they are duplicated into each
partition.
Differential Revision: https://reviews.llvm.org/D65909
llvm-svn: 368375
This allows undefined references in input files be resolved by the
optional symbols. Previously we were doing this before input file
reading which means it was working only for command line symbols
references (i.e. -u or --export).
Also use addOptionalDataSymbol for __dso_handle and make all optional
symbols hidden by default.
Differential Revision: https://reviews.llvm.org/D65920
llvm-svn: 368310
This patch Implements the R_AARCH64_TLSLE_MOVW_TPREL_G*[_NC]. These are
logically the same calculation as the existing TLSLE relocations with
the result written back to mov[nz] and movk instructions. A typical code
sequence is:
movz x0, #:tprel_g2:foo // bits [47:32] of R_TLS with overflow check
movk x0, #:tprel_g1_nc:foo // bits [31:16] of R_TLS with no overflow check
movk x0, #:tprel_g0_nc:foo // bits [15:0] of R_TLS with no overflow check
This type of code sequence is usually used with a large code model.
Differential Revision: https://reviews.llvm.org/D65882
Fixes: PR42853
llvm-svn: 368293
There's still a need for a deeper fix to the way libDebugInfoDWARF error
messages are propagated up to lld - if lld had exited non-zero on this
error message we would've found the issue sooner.
llvm-svn: 368229
D65213 (rL367536) does not work for the case when a source file path
includes subdirectories.
Differential Revision: https://reviews.llvm.org/D65810
llvm-svn: 368153