
15048 Commits

Author SHA1 Message Date
Alexander Shaposhnikov 4450a2a23d [lld][ELF] Add support for ADRP+ADD optimization for AArch64
This diff adds support for ADRP+ADD optimization for AArch64 described in
d2ca58c54b
i.e. under appropriate constraints

ADRP  x0, symbol
ADD   x0, x0, :lo12: symbol

can be turned into

NOP
ADR   x0, symbol
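
A minimal sketch of one such constraint, assuming the deciding factor is the reach of `ADR` (a signed 21-bit byte offset, i.e. +/- 1 MiB from the instruction). The function name is hypothetical and this is not the code from the patch; other conditions, such as both instructions using the same register, are not modeled.

```python
ADR_RANGE = 1 << 20  # ADR encodes a signed 21-bit byte offset: +/- 1 MiB


def can_relax_adrp_add(adrp_addr: int, symbol_addr: int) -> bool:
    """Return True if `ADRP; ADD` at adrp_addr could become `NOP; ADR`."""
    adr_addr = adrp_addr + 4  # the ADR takes the slot of the ADD
    delta = symbol_addr - adr_addr
    return -ADR_RANGE <= delta < ADR_RANGE


near = can_relax_adrp_add(0x10000, 0x20000)               # well within 1 MiB
far = can_relax_adrp_add(0x10000, 0x10004 + ADR_RANGE)    # one byte too far
print(near, far)
```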

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D117614
2022-02-02 06:09:55 +00:00
Jez Ng 3e951808d5 [lld-macho][nfc] Comments and style fixes
Added some comments (particularly around finalize() and
finalizeContents()) as well as doing some rephrasing / grammar fixes for
existing comments.

Also did some minor style fixups, such as by putting methods together in
a class definition and having fields of similar types next to each
other.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D118714
2022-02-01 13:45:59 -05:00
Fangrui Song fbf2f66400 [ELF] Update flag propagation rule to ignore discarded output sections
See the updated insert-before.test for the effects: many synthetic
sections are SHF_ALLOC|SHF_WRITE. If they are discarded, we don't want
to propagate their flags to subsequent output section descriptions.

`getFirstInputSection(sec) == nullptr` can technically be merged into
`isDiscardable` but I'd like to postpone that as not sharing code may give more
refactoring opportunity.

Depends on D118529.

Reviewed By: peter.smith, bluca

Differential Revision: https://reviews.llvm.org/D118530
2022-02-01 10:19:30 -08:00
Fangrui Song a0318711c8 [ELF] Rename adjustSectionsBeforeSorting to adjustOutputSections and make it affect INSERT commands
adjustSectionsBeforeSorting updates some output section attributes
(alignment/flags) and removes discardable empty sections. When it is called,
INSERT commands have not been processed. Therefore the flags propagation rule
may not affect output sections defined in an INSERT command properly.

Fix this by moving processInsertCommands before adjustSectionsBeforeSorting.

adjustSectionsBeforeSorting is somewhat misnamed. The order between it and
sortInputSections does not matter. With the pass shuffle, the name of
adjustSectionsBeforeSorting becomes wrong. Therefore rename it. The new
name is not set in stone. The function mixes several tasks and the
code may be refactored in a way that we may give them more meaningful
names.

With this patch, I think the behavior of attribute propagation becomes more
reasonable. In particular, in the absence of non-INSERT SECTIONS,
inserting a section after a SHF_ALLOC one will give us a SHF_ALLOC section,
not a non-SHF_ALLOC one (see linkerscript/insert-after.test).

Reviewed By: peter.smith, bluca

Differential Revision: https://reviews.llvm.org/D118529
2022-02-01 10:16:12 -08:00
Fangrui Song 0c3704fdbd [ELF] Deduplicate names of local symbols only with -O2
The deduplication requires a DenseMap of the same size as the local part of
.strtab . I optimized it in e205445434 but it is
still quite slow.

For Release build of clang, deduplication makes .strtab 1.1% smaller and makes the link 3% slower.
For chrome, deduplication makes .strtab 0.1% smaller and makes the link 6% slower.

I suggest that we only perform the optimization with -O2 (default is -O1).
Not deduplicating local symbol names will simplify parallel symbol table write.
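
A rough illustration of the trade-off, not lld's implementation: `build_strtab` is a hypothetical helper that builds a string table either by plain appending or with a name-to-offset map. The map is the extra work (and memory) that the message proposes to skip below -O2.

```python
def build_strtab(names, dedup):
    """Return (strtab bytes, per-symbol offsets) for a list of symbol names."""
    table = bytearray(b"\0")  # ELF string tables start with a NUL byte
    seen = {}                 # name -> offset; only used when dedup is on
    offsets = []
    for name in names:
        if dedup and name in seen:
            offsets.append(seen[name])  # reuse the earlier copy
            continue
        off = len(table)
        table += name.encode() + b"\0"
        if dedup:
            seen[name] = off
        offsets.append(off)
    return bytes(table), offsets


names = ["tmp", "loop", "tmp", "tmp"]
plain, _ = build_strtab(names, dedup=False)
small, offs = build_strtab(names, dedup=True)
print(len(plain), len(small), offs)
```

Without the map every symbol is independent of the others, which is what makes a parallel write straightforward.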

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D118577
2022-02-01 10:10:22 -08:00
Fangrui Song 17a39aecd1 [ELF] Simplify code with invokeELFT. NFC 2022-02-01 09:53:29 -08:00
Fangrui Song 7518d38f0a [ELF] De-template LinkerDriver::link. NFC
Replace `f<ELFT>(x)` with `invokeELFT(f, x)`.
The size reduction comes from turning `link` from 4 specializations into 1.

My x86-64 lld executable is 26KiB smaller.

Reviewed By: ikudrin

Differential Revision: https://reviews.llvm.org/D118551
2022-02-01 09:47:56 -08:00
Alexander Shaposhnikov 7244901ef6 [lld][MachO] Fix typo in rename.s 2022-02-01 11:57:04 +00:00
Alexander Shaposhnikov f131d4d0d0 [lld][ELF] Add missing RUN in aarch64-adrp-ldr-got.s 2022-02-01 11:25:16 +00:00
Fangrui Song 4d38d7684c [ELF] Change vector<Symbol *> to SmallVector. NFC 2022-02-01 00:16:42 -08:00
Fangrui Song 196aedb843 [ELF] Change vector<InputSection *> to SmallVector. NFC
My x86-64 lld executable is 8KiB smaller.
2022-02-01 00:14:21 -08:00
Fangrui Song d97749fabc [ELF] Switch split-stack to use SmallVector. NFC
My x86-64 lld executable is 1.1KiB smaller.
2022-02-01 00:09:30 -08:00
Jez Ng 96fb7d059d [lld-macho][test] Add test for UUID format
Reviewed By: keith

Differential Revision: https://reviews.llvm.org/D118646
2022-01-31 23:52:42 -05:00
Fangrui Song 7aaf024dac [BitcodeWriter] Fix cases of some functions
`WriteIndexToFile` is used by external projects so I do not touch it.
2022-01-31 16:46:11 -08:00
Fangrui Song 457273fda5 [ELF] splitStrings: replace entSize==1 special case with manual loop unswitch. NFC
My x86-64 lld executable is actually smaller.
2022-01-30 17:15:45 -08:00
Fangrui Song 7cd0c45364 [ELF] Simplify SectionBase::partition handling and make it live by default. NFC
Previously an InputSectionBase is dead (`partition==0`) by default.
SyntheticSection calls markLive and BssSection overrides that with markDead.

It is more natural to make InputSectionBase live by default and let
--gc-sections mark InputSectionBase dead.

When linking a Release build of clang:

* --no-gc-sections: the removed `inputSections` loop decreases markLive time from 4ms to 1ms.
* --gc-sections: the extra `inputSections` loop increases markLive time from 0.181296s to 0.188526s.
  This is because we lose the optimization that removed one `inputSections` loop (4374824ccf).
  I believe the loss can be mitigated if we refactor markLive.
2022-01-30 15:12:09 -08:00
Fangrui Song 73fd7d2304 [ELF] Change splitSections to objectFiles based parallelForEach. NFC
The work is more balanced.
2022-01-30 13:34:27 -08:00
Keith Smiley a6298fb160 [lld-macho] Add support for -add_empty_section
This is a ld64 option equivalent to `-sectcreate seg sect /dev/null`
that's useful for creating sections like the RESTRICT section.

Differential Revision: https://reviews.llvm.org/D117749
2022-01-30 10:03:41 -08:00
Keith Smiley 0ab09a9009 [test][lld-macho] Improve LC_FUNCTION_STARTS test coverage
Previously functions that aren't included in the symtab were also
excluded from the function starts. Symbols missing from function starts
degrade the debugger experience in the case you don't have debug info
for them.

Differential Revision: https://reviews.llvm.org/D114275
2022-01-30 09:46:36 -08:00
Fangrui Song 5a2020d069 [ELF] copyShtGroup: replace unordered_set<uint32_t> with DenseSet<uint32_t>. NFC
We don't need to support the empty/tombstone key section index.
2022-01-30 01:18:41 -08:00
Fangrui Song f318fd9bf8 [ELF] crtbegin/crtend test: replace std::regex with hand-written matcher. NFC
My x86-64 lld executable is 18KiB smaller.
2022-01-30 01:11:19 -08:00
Fangrui Song a7f9c002cd [ELF][test] Test {crtbegin,crtend}{S,T}.o 2022-01-30 01:08:10 -08:00
Fangrui Song fcd8817da5 [ELF] Simplify maybeCompress with lld::split. NFC 2022-01-30 00:44:19 -08:00
Fangrui Song bc1369fae3 [ELF] Optimize MergeInputSection::splitNonStrings with resize_for_overwrite. NFC 2022-01-30 00:10:52 -08:00
Fangrui Song 988a03c585 [ELF] Add some Mips*Section to InStruct and change make<Mips*Section> to std::make_unique
Similar to D116143. My x86-64 lld executable is 20+KiB smaller.
2022-01-29 23:55:29 -08:00
Fangrui Song c0b986aa0c [ELF] Remove make<std::unique_ptr<MemoryBuffer>>. NFC 2022-01-29 23:35:15 -08:00
Fangrui Song 8d8fce87bb [ELF] De-template getErrorPlace. NFC 2022-01-29 23:05:54 -08:00
Fangrui Song 72a005bf19 [ELF] De-template getAndFeatures. NFC 2022-01-29 20:11:59 -08:00
Fangrui Song d754c0b64f [ELF] Make errorOrWarn opaque to decrease code size. NFC
In my x86-64 lld, .text is 3.08Ki smaller.
2022-01-29 19:31:09 -08:00
Fangrui Song ee647d4c96 [ELF] Optimize obj.getSectionIndex. NFC 2022-01-29 18:01:58 -08:00
Fangrui Song 5d00d37617 [ELF] Simplify eSyms. NFC 2022-01-29 17:00:38 -08:00
Fangrui Song d86435c230 [ELF] createInputSection: remove unneeded argument. NFC 2022-01-29 16:52:32 -08:00
Fangrui Song ee7720acd6 [ELF] Avoid repeated getObj construction in getSectionIndex. NFC 2022-01-29 16:51:00 -08:00
Fangrui Song 94e97e668c [ELF] Reorder InputSectionBase::parent. NFC
Move it before others.
2022-01-29 16:20:40 -08:00
Fangrui Song b204d7c459 [ELF] Reorder InputFile members. NFC
`symbols` is used frequently. Moving it before others can decrease offsets.
2022-01-29 16:10:52 -08:00
Fangrui Song 469c4124ab [ELF] --gdb-index: switch to SmallVector. NFC 2022-01-29 15:24:56 -08:00
Fangrui Song da0e5b885b [ELF] Refactor -z combreloc
* `RelocationBaseSection::addReloc` increases `numRelativeRelocs`, which
  duplicates the work done by RelocationSection<ELFT>::writeTo.
* --pack-dyn-relocs=android has inappropriate DT_RELACOUNT.
  AndroidPackedRelocationSection does not necessarily place relative relocations
  in the front and DT_RELACOUNT might cause semantics error (though our
  implementation doesn't and Android bionic doesn't use DT_RELACOUNT anyway.)

Move `llvm::partition` to a new function `partitionRels` and compute
`numRelativeRelocs` there. Now `RelocationBaseSection::addReloc` is trivial and
can be moved to the header to enable inlining.
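
The idea can be sketched as follows. This is an illustration only: `partition_rels` and the tuple layout are hypothetical, and unlike `llvm::partition` this version happens to preserve relative order.

```python
def partition_rels(relocs, is_relative):
    """Move relative relocations to the front; return how many there are."""
    relative = [r for r in relocs if is_relative(r)]
    others = [r for r in relocs if not is_relative(r)]
    relocs[:] = relative + others  # reorder in place
    return len(relative)


# Hypothetical (type, offset) pairs.
relocs = [
    ("R_X86_64_64", 0x10),
    ("R_X86_64_RELATIVE", 0x20),
    ("R_X86_64_GLOB_DAT", 0x30),
    ("R_X86_64_RELATIVE", 0x08),
]
num_relative = partition_rels(relocs, lambda r: r[0].endswith("_RELATIVE"))
print(num_relative, [r[0] for r in relocs])
```

Counting in the partition step means adding a relocation no longer has to maintain the counter.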

The rest of DynamicReloc and `-z combreloc` handling is moved to the
non-template `RelocationBaseSection::computeRels` to decrease code size. My
x86-64 lld executable is 44+KiB smaller.

While here, rename `sort` to `combreloc`.
2022-01-29 14:45:58 -08:00
Mateusz Mikuła 460830a9c6 [LLD][MinGW] Add --heap argument support
Noticed in https://github.com/msys2/MINGW-packages/pull/10567.

Differential Revision: https://reviews.llvm.org/D118405
2022-01-30 00:01:45 +02:00
Fangrui Song f097c108b8 [ELF][test] Improve INSERT [AFTER|BEFORE] and adjustSectionsBeforeSorting tests 2022-01-28 22:21:13 -08:00
Petr Hosek 71dcd9bd04 [ELF] Change the search order for dependent libraries
When processing dependent libraries, if there's a directory of the same
name as the library being searched for, either in the current directory
or earlier in the search order, LLD will try to open it and report an
error. This is because LLD uses file existence check. To address this
issue we reverse the order, searching the library by basename first
and only considering search paths later, and current directory last.
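
A sketch of the new order, under assumptions: `find_dependent_library` is a hypothetical helper, and "by basename" is simplified to probing `lib<name>.a` in each search path. It keeps the plain existence check to show why the order matters when a directory shadows the library name.

```python
import os
import tempfile


def find_dependent_library(specifier, search_paths, cwd="."):
    """Resolve a dependent-library specifier; return a path or None."""
    # 1. By basename: lib<specifier>.a in each search path.
    for d in search_paths:
        candidate = os.path.join(d, "lib" + specifier + ".a")
        if os.path.exists(candidate):
            return candidate
    # 2. The specifier as written, relative to each search path.
    for d in search_paths:
        candidate = os.path.join(d, specifier)
        if os.path.exists(candidate):
            return candidate
    # 3. The current directory, last.
    candidate = os.path.join(cwd, specifier)
    return candidate if os.path.exists(candidate) else None


with tempfile.TemporaryDirectory() as root:
    lib_dir = os.path.join(root, "lib")
    cwd = os.path.join(root, "cwd")
    os.makedirs(lib_dir)
    os.makedirs(os.path.join(cwd, "foo"))  # a directory that shadows the name
    open(os.path.join(lib_dir, "libfoo.a"), "w").close()
    found = find_dependent_library("foo", [lib_dir], cwd)
    print(os.path.basename(found))  # libfoo.a
```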

Differential Revision: https://reviews.llvm.org/D118498
2022-01-28 20:46:01 -08:00
Fangrui Song 33b38339a0 [lld] Add module name to LTO inline asm diagnostic
Close #52781: for LTO, the inline asm diagnostic uses `<inline asm>` as the file
name (lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp) and it is unclear which
module has the issue.

With this patch, we will see the module name (say `asm.o`) before `<inline asm>` with ThinLTO.

```
% clang -flto=thin -c asm.c && myld.lld asm.o -e f
ld.lld: error: asm.o <inline asm>:1:2: invalid instruction mnemonic 'invalid'
        invalid
        ^~~~~~~
```

For regular LTO, unfortunately the original module name is lost and we only get
ld-temp.o.

Reviewed By: #lld-macho, ychen, Jez Ng

Differential Revision: https://reviews.llvm.org/D118434
2022-01-28 11:32:42 -08:00
Roger Kim 422084332a [lld][Macho] Include dead-stripped symbols in mapfile
ld64 outputs dead stripped symbols when using the -dead_strip flag. This change mimics that behavior for lld.

ld64's -dead_strip flag outputs:
```
$ ld -map map basics.o -o out -dead_strip -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -lSystem
$ cat map
# Path: out
# Arch: x86_64
# Object files:
[  0] linker synthesized
[  1] basics.o
# Sections:
# Address       Size            Segment Section
0x100003F97     0x00000021      __TEXT  __text
0x100003FB8     0x00000048      __TEXT  __unwind_info
0x100004000     0x00000008      __DATA_CONST    __got
0x100008000     0x00000010      __DATA  __ref_section
0x100008010     0x00000001      __DATA  __common
# Symbols:
# Address       Size            File  Name
0x100003F97     0x00000006      [  1] _ref_local
0x100003F9D     0x00000001      [  1] _ref_private_extern
0x100003F9E     0x0000000C      [  1] _main
0x100003FAA     0x00000006      [  1] _no_dead_strip_globl
0x100003FB0     0x00000001      [  1] _ref_from_no_dead_strip_globl
0x100003FB1     0x00000006      [  1] _no_dead_strip_local
0x100003FB7     0x00000001      [  1] _ref_from_no_dead_strip_local
0x100003FB8     0x00000048      [  0] compact unwind info
0x100004000     0x00000008      [  0] non-lazy-pointer-to-local: _ref_com
0x100008000     0x00000008      [  1] _ref_data
0x100008008     0x00000008      [  1] l_ref_data
0x100008010     0x00000001      [  1] _ref_com

# Dead Stripped Symbols:
#               Size            File  Name
<<dead>>        0x00000006      [  1] _unref_extern
<<dead>>        0x00000001      [  1] _unref_local
<<dead>>        0x00000007      [  1] _unref_private_extern
<<dead>>        0x00000001      [  1] _ref_private_extern_u
<<dead>>        0x00000008      [  1] _unref_data
<<dead>>        0x00000008      [  1] l_unref_data
<<dead>>        0x00000001      [  1] _unref_com
```

Reviewed By: int3, #lld-macho, thevinster

Differential Revision: https://reviews.llvm.org/D114737
2022-01-28 10:51:27 -08:00
Alexander Shaposhnikov 0d71f2e097 [lld][ELF] Cleanup %t directory in tests, NFC 2022-01-28 08:41:52 +00:00
Sam Clegg 875ee937ae [lld][WebAssembly] Handle TLS symbols in older object file
In older versions of llvm (e.g. llvm 13), symbols were not individually
flagged as TLS.  In this case, the intent was to implicitly mark any
symbols defined in TLS segments as TLS.  However, we were not performing
this implicit conversion if the segment was explicitly marked as TLS.

As it happens, llvm 13 was branched between the addition of the segment
flag and the addition of the symbol flag. See:

- segment flag added: https://reviews.llvm.org/D102202
- symbol flag added: https://reviews.llvm.org/D109426

Testing this is tricky because the assembler will imply the TLS status
of the symbol based on the segment it's declared in, so we are forced to
use a yaml file here.

Fixes: https://github.com/emscripten-core/emscripten/issues/15891

Differential Revision: https://reviews.llvm.org/D118414
2022-01-27 17:27:09 -08:00
Fangrui Song 3bc152769d [ELF] Parallelize computeIsPreemptible 2022-01-26 23:45:04 -08:00
Fangrui Song 1372d53639 [ELF] Optimize two vector. NFC 2022-01-26 23:10:40 -08:00
Fangrui Song afeb4a6628 [ELF] Optimize -Map. NFC
getVA is slow. Avoid calling it in the llvm::sort comparator.
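
The general pattern, sketched with hypothetical names (`get_va` here is a stand-in, not lld's function): compute each address once, then sort on the cached value so the comparator never triggers the expensive call.

```python
calls = 0


def get_va(sym):
    """Stand-in for an expensive address computation."""
    global calls
    calls += 1
    return sym["section_base"] + sym["offset"]


def sort_by_va(symbols):
    # One get_va call per symbol; the index breaks ties deterministically.
    keyed = [(get_va(s), i, s) for i, s in enumerate(symbols)]
    keyed.sort(key=lambda t: (t[0], t[1]))
    return [s for _, _, s in keyed]


symbols = [
    {"name": "c", "section_base": 0x2000, "offset": 8},
    {"name": "a", "section_base": 0x1000, "offset": 0},
    {"name": "b", "section_base": 0x1000, "offset": 4},
]
ordered = sort_by_va(symbols)
print([s["name"] for s in ordered], calls)
```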
2022-01-26 22:51:31 -08:00
Fangrui Song 14b7785c09 [ELF] Simplify InputSection::writeTo. NFC 2022-01-26 22:03:26 -08:00
Fangrui Song 913914f0f8 [ELF] Simplify writing the Elf_Chdr header. NFC
And avoid changing `size` in `writeTo`.
2022-01-26 10:23:56 -08:00
Benjamin Kramer f15014ff54 Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17"
This reverts commit ef82063207.

- It conflicts with the existing llvm::size in STLExtras, which will now
  never be called.
- Calling it without llvm:: breaks C++17 compat
2022-01-26 16:55:53 +01:00
serge-sans-paille ef82063207 Rename llvm::array_lengthof into llvm::size to match std::size from C++17
As a consequence move llvm::array_lengthof from STLExtras.h to
STLForwardCompat.h (which is included by STLExtras.h so no build
breakage expected).
2022-01-26 16:17:45 +01:00
Fangrui Song 3704abaa16 [ELF] --gdb-index: replace vector<uint8_t> with unique_ptr<uint8_t[]>. NFC 2022-01-25 23:53:23 -08:00
Fangrui Song 571d6a7120 [ELF] Optimize .relr.dyn to not grow vector<uint64_t>. NFC 2022-01-25 23:33:40 -08:00
Fangrui Song 9fac78d0e1 [ELF] Simplify and optimize .relr.dyn NFC 2022-01-25 22:50:03 -08:00
Fangrui Song 2a80c3dbe1 [ELF] Clarify that Z_BEST_SPEED==1 in a comment. NFC 2022-01-25 22:40:53 -08:00
Fangrui Song 07bd467643 [ELF] --build-id: replace vector<uint8_t> with unique_ptr<uint8_t[]>. NFC
We can't use C++20 make_unique_for_overwrite yet.
2022-01-25 22:39:43 -08:00
Fangrui Song 7438dbe078 [ELF] Cast size to size_t. NFC
To fix

../../chromeclang/bin/../include/c++/v1/__algorithm/min.h:39:1: note: candidate template ignored: deduced conflicting types for parameter '_Tp' ('unsigned long' vs. 'unsigned long long')

on macOS arm64.
2022-01-25 22:38:24 -08:00
Fangrui Song 223f9dea3d [ELF] maybeCompress: replace vector<uint8_t> with unique_ptr<uint8_t[]>. NFC
And mention that it is zero-initialized. I do not notice a speed-up if
changed to be uninitialized by forcing the zero filler in writeTo.
2022-01-25 22:15:44 -08:00
Puyan Lotfi 227d18b3a8 [lld][macho][NFC] Make MachO/start-end.s test less brittle by checking for _main:
In start-end.s there is a lit check line `# SEG: _main` to begin the
check at the start of the function main where `_main` is the Darwin name
mangling for C main. Because the text file that FileCheck is getting as
input has the path of the compiler build in it from llvm-mc and
llvm-objdump, and because of the lack of a trailing colon in this check
line we end up inadvertently matching against the line of text with the
compiler path in it in the case where said path contains "_main" some
place. This can be very likely if the compiler branch has "main" or
"_main" in it.

To fix this I include the trailing `:` since that will match on the
function label and not the path line.
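
A small model of the failure, assuming a plain substring search stands in for a FileCheck pattern. The path and the output lines are made up for illustration.

```python
def first_match(lines, pattern):
    """Index of the first line containing pattern, or None."""
    for i, line in enumerate(lines):
        if pattern in line:
            return i
    return None


# Hypothetical tool output: the build path happens to contain "_main".
output = [
    "/home/user/llvm_main/build/test.o:\tfile format mach-o 64-bit x86-64",
    "",
    "_main:",
    "\tretq",
]
loose = first_match(output, "_main")    # hits the path line
strict = first_match(output, "_main:")  # hits the function label
print(loose, strict)
```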
2022-01-25 19:23:51 -08:00
Fangrui Song 4cdc441690 [ELF] Parallelize --compress-debug-sections=zlib
When linking a Debug build clang (265MiB SHF_ALLOC sections, 920MiB uncompressed
debug info), in a --threads=1 link "Compress debug sections" takes 2/3 time and
in a --threads=8 link "Compress debug sections" takes ~70% time.

This patch splits a section into 1MiB shards and calls zlib `deflate` in parallel.

DEFLATE blocks are a bit sequence. We need to ensure every shard starts
at a byte boundary for concatenation. We use Z_SYNC_FLUSH for all shards
but the last to flush the output to a byte boundary. (Z_FULL_FLUSH can
be used as well, but Z_FULL_FLUSH clears the hash table which just
wastes time.)

The last block requires the BFINAL flag. We call deflate with Z_FINISH
to set the flag as well as flush the output to a byte boundary. Under
the hood, all of Z_SYNC_FLUSH, Z_FULL_FLUSH, and Z_FINISH emit a
non-compressed block (called stored block in zlib). RFC1951 says "Any
bits of input up to the next byte boundary are ignored."
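
The scheme can be tried out with Python's `zlib` module. This is a sketch under assumptions, not the lld code: each shard gets its own raw DEFLATE stream, and the zlib header and Adler-32 trailer are added once around the concatenation. `compress_sharded` and the thread pool are illustrative choices.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

SHARD_SIZE = 1 << 20  # 1 MiB, as in the commit message


def _deflate_shard(shard, last):
    # Raw DEFLATE (wbits=-15): no zlib header or trailer per shard.
    c = zlib.compressobj(1, zlib.DEFLATED, -15)
    # Z_SYNC_FLUSH ends on a byte boundary; Z_FINISH also sets BFINAL.
    flush = zlib.Z_FINISH if last else zlib.Z_SYNC_FLUSH
    return c.compress(shard) + c.flush(flush)


def compress_sharded(data, shard_size=SHARD_SIZE):
    shards = [data[i:i + shard_size]
              for i in range(0, len(data), shard_size)] or [b""]
    is_last = [i == len(shards) - 1 for i in range(len(shards))]
    with ThreadPoolExecutor() as pool:
        bodies = list(pool.map(_deflate_shard, shards, is_last))
    header = b"\x78\x01"  # zlib header: deflate, 32 KiB window, no dictionary
    trailer = zlib.adler32(data).to_bytes(4, "big")
    return header + b"".join(bodies) + trailer


data = b"debug info " * 300000
blob = compress_sharded(data)
roundtrip_ok = zlib.decompress(blob) == data
print(roundtrip_ok)
```

A standard zlib decompressor accepts the result because every shard but the last ends in a non-final block on a byte boundary.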

In a --threads=8 link, "Compress debug sections" is 5.7x as fast and the total
speed is 2.54x. Because the hash table for one shard is not shared with the next
shard, the output is slightly larger. Better compression ratio can be achieved
by preloading the window size from the previous shard as dictionary
(`deflateSetDictionary`), but that is overkill.

```
# 1MiB shards
% bloaty clang.new -- clang.old
    FILE SIZE        VM SIZE
 --------------  --------------
  +0.3%  +129Ki  [ = ]       0    .debug_str
  +0.1%  +105Ki  [ = ]       0    .debug_info
  +0.3%  +101Ki  [ = ]       0    .debug_line
  +0.2% +2.66Ki  [ = ]       0    .debug_abbrev
  +0.0% +1.19Ki  [ = ]       0    .debug_ranges
  +0.1%  +341Ki  [ = ]       0    TOTAL

# 2MiB shards
% bloaty clang.new -- clang.old
    FILE SIZE        VM SIZE
 --------------  --------------
  +0.2% +74.2Ki  [ = ]       0    .debug_line
  +0.1% +72.3Ki  [ = ]       0    .debug_str
  +0.0% +69.9Ki  [ = ]       0    .debug_info
  +0.1%    +976  [ = ]       0    .debug_abbrev
  +0.0%    +882  [ = ]       0    .debug_ranges
  +0.0%  +218Ki  [ = ]       0    TOTAL
```

Bonus in not using zlib::compress

* we can compress a debug section larger than 4GiB
* peak memory usage is lower because for most shards the output size is less
  than 50% input size (all less than 55% for a large binary I tested, but
  decreasing the initial output size does not decrease memory usage)

Reviewed By: ikudrin

Differential Revision: https://reviews.llvm.org/D117853
2022-01-25 10:29:04 -08:00
Leonard Grey a5c9d71780 [lld-macho] Move order file and call graph sorting into SectionPriorities
See https://reviews.llvm.org/D117354 for context and discussion.
2022-01-25 12:18:15 -05:00
Leonard Grey f23d57a632 [lld-macho] Rename CallGraphSort.{h,cpp} to SectionPriorities
This is in preparation for moving the code that parses and processes
order files into this file.

See https://reviews.llvm.org/D117354 for context and discussion.
2022-01-25 12:15:14 -05:00
Fangrui Song c03fdd3403 [ELF] Fix the branch range computation when reusing a thunk
Notation: dst is `t->getThunkTargetSym()->getVA()`

On AArch64, when `src-0x8000000-r_addend <= dst < src-0x8000000`, the condition
`target->inBranchRange(rel.type, src, rel.sym->getVA(rel.addend))` may
incorrectly consider a thunk reusable.
`rel.addend = -getPCBias(rel.type)` resets the addend to 0 for AArch64/PPC
and the zero addend is used by `rel.sym->getVA(rel.addend)` to check
out-of-range relocations.
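
A worked instance of the window described above, using the R_AARCH64_JUMP26 limits from the error message below and the r_addend value the message mentions. The function names are hypothetical and only model the two range checks.

```python
JUMP26_MIN, JUMP26_MAX = -134217728, 134217727  # limits from the error message


def in_branch_range(src, dst):
    return JUMP26_MIN <= dst - src <= JUMP26_MAX


def thunk_reusable_old(src, dst, r_addend):
    # Old check: the original addend is still folded into the destination.
    return in_branch_range(src, dst + r_addend)


def thunk_reusable_new(src, dst):
    # Fixed check: the branch to the thunk is emitted with a zero addend.
    return in_branch_range(src, dst)


src = 0x10000000
r_addend = 19960
dst = src - 0x8000000 - 4  # 4 bytes beyond the backward reach

old = thunk_reusable_old(src, dst, r_addend)
new = thunk_reusable_new(src, dst)
print(old, new, dst - src)
```

With these numbers the old check accepts the thunk even though the emitted branch would need a displacement of -134217732.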

See the test for a case this computation is wrong:
`error: a.o:(.text_high+0x4): relocation R_AARCH64_JUMP26 out of range: -134217732 is not in [-134217728, 134217727]`
I have seen a real world case with r_addend=19960.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D117734
2022-01-24 09:03:21 -08:00
serge-sans-paille 5f290c090a Move STLFunctionalExtras out of STLExtras
Only using that change in StringRef already decreases the number of
preprocessed lines from 7837621 to 7776151 for LLVMSupport.

Perhaps more interestingly, it shows that many files were relying on the
inclusion of StringRef.h to have the declaration from STLExtras.h. This
patch tries hard to patch relevant part of llvm-project impacted by this
hidden dependency removal.

Potential impact:
- "llvm/ADT/StringRef.h" no longer includes <memory>,
  "llvm/ADT/Optional.h" nor "llvm/ADT/STLExtras.h"

Related Discourse thread:
https://llvm.discourse.group/t/include-what-you-use-include-cleanup/5831
2022-01-24 14:13:21 +01:00
Peter Smith a08447d0de [LLD][ELF][AArch64] Update test with incorrect REQUIRES line [NFC]
D54759 introduced aarch64-combined-dynrel.s and
aarch64-combined-dynrel-ifunc.s . Unfortunately the requires line
at the top was AArch64 instead of aarch64 which means they were never
run. Update the tests to use aarch64 and fix to match current lld output.

Differential Revision: https://reviews.llvm.org/D117896
2022-01-24 10:04:28 +00:00
Sam Clegg ac2f3df839 [lld][WebAssembly] Remove redundant config setting
Unresolved symbols are not currently reported when building with
`-shared` or `-pie` so setting unresolvedSymbols doesn't have any
effect.

Differential Revision: https://reviews.llvm.org/D117737
2022-01-20 15:21:56 -08:00
Roger Kim f84023a812 [lld][macho] Stop grouping symbols by sections in mapfile.
As per [Bug 50689](https://bugs.llvm.org/show_bug.cgi?id=50689),

```
2. getSectionSyms() puts all the symbols into a map of section -> symbols, but this seems unnecessary. This was likely copied from the ELF port, which prints a section header before the list of symbols it contains. But the Mach-O map file doesn't print these headers.
```

This diff removes `getSectionSyms()` and keeps all symbols in a flat vector.

What does ld64's mapfile look like?
```
$ llvm-mc -filetype=obj -triple=x86_64-apple-darwin test.s -o test.o
$ llvm-mc -filetype=obj -triple=x86_64-apple-darwin foo.s -o foo.o
$ ld -map map test.o foo.o -o out -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -lSystem
```

```
[  0] linker synthesized
[  1] test.o
[  2] foo.o
0x100003FB7     0x00000001      __TEXT  __text
0x100003FB8     0x00000000      __TEXT  obj
0x100003FB8     0x00000048      __TEXT  __unwind_info
0x100004000     0x00000001      __DATA  __common
0x100003FB7     0x00000001      [  1] _main
0x100003FB8     0x00000000      [  2] _foo
0x100003FB8     0x00000048      [  0] compact unwind info
0x100004000     0x00000001      [  1] _number
```

Perf numbers when linking chromium framework on a 16-Core Intel Xeon W Mac Pro:
```
base           diff           difference (95% CI)
sys_time   1.406 ± 0.020  1.388 ± 0.019  [  -1.9% ..   -0.6%]
user_time  5.557 ± 0.023  5.914 ± 0.020  [  +6.2% ..   +6.6%]
wall_time  4.455 ± 0.041  4.436 ± 0.035  [  -0.8% ..   -0.0%]
samples    35             35
```

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D114735
2022-01-20 12:16:37 -08:00
Alexandre Ganea 83d59e05b2 Re-land [LLD] Remove global state in lldCommon
Move all variables at file-scope or function-static-scope into a hosting structure (lld::CommonLinkerContext) that lives at lldMain()-scope. Drivers will inherit from this structure and add their own global state, in the same way as for the existing COFFLinkerContext.

See discussion in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151184.html

The previous land f860fe3622 caused issues in https://lab.llvm.org/buildbot/#/builders/123/builds/8383, fixed by 22ee510dac.

Differential Revision: https://reviews.llvm.org/D108850
2022-01-20 14:53:26 -05:00
John Ericson df31ff1b29 [cmake] Make include(GNUInstallDirs) always below project(..)
Its defaulting logic must go after `project(..)` to work correctly,  but `project(..)` is often in a standalone condition making this
awkward, since the rest of the condition code may also need GNUInstallDirs.

The good thing is there are the various standalone booleans, which I had missed before. This makes splitting the conditional blocks less awkward.

Reviewed By: arichardson, phosek, beanz, ldionne, #libunwind, #libc, #libc_abi

Differential Revision: https://reviews.llvm.org/D117639
2022-01-20 18:59:17 +00:00
Sam Clegg feddf11502 [lld][WebAssembly] Convert test to check disassembly output. NFC
Differential Revision: https://reviews.llvm.org/D117739
2022-01-20 09:32:01 -08:00
Adrian Prantl 54ba376d08 Add missing include to fix modular build 2022-01-20 08:33:44 -08:00
Jez Ng 8f811effac [lld-macho] Fix grammar in doc 2022-01-19 23:59:35 -08:00
Fangrui Song a7a4115bf3 [ELF] Replace .zdebug string comparison with SHF_COMPRESSED check. NFC 2022-01-19 22:33:32 -08:00
Fangrui Song 03909c4400 [ELF] Remove StringRefZ
StringRefZ does not improve performance. Non-local symbols always have eagerly
computed nameSize. Most local symbols' lengths will be updated in either:

* shouldKeepInSymtab
* SymbolTableBaseSection::addSymbol

Its benefit is offset by strlen in every call site (sums up to 5KiB code in a
release x86-64 build), so using StringRefZ may be slower.

In a -s link (uncommon) there is minor speedup, like ~0.3% for clang and chrome.

Reviewed By: alexander-shaposhnikov

Differential Revision: https://reviews.llvm.org/D117644
2022-01-19 20:09:41 -08:00
Alexandre Ganea aba5b91b69 Re-land [CodeView] Add full repro to LF_BUILDINFO record
This patch writes the full -cc1 command into the resulting .OBJ, like MSVC does. This allows for external tools (Recode, Live++) to rebuild a source file without any external dependency but the .OBJ itself (other than the compiler) and without knowledge of the build system.

The LF_BUILDINFO record stores a full path to the compiler, the PWD (CWD at program startup), a relative or absolute path to the source, and the full CC1 command line. The stored command line is self-standing (does not depend on the environment). In the same way, MSVC doesn't exactly store the provided command-line, but an expanded version (a somehow equivalent of CC1) which is also self-standing.

For more information see PR36198 and D43002.

Differential Revision: https://reviews.llvm.org/D80833
2022-01-19 19:44:37 -05:00
Jez Ng ef95d45138 [lld-macho] Mention string literal deduplication as a difference from ld64
Reviewed By: keith

Differential Revision: https://reviews.llvm.org/D117250
2022-01-19 16:30:52 -08:00
Keith Smiley 3f38dc5c04 [lld-macho] Silence XAR deprecation warning
If you're building this on macOS 12.x+ this produces a deprecation
warning. I'm not sure what this means for the bitcode format going
forward, but it seems safe to silence for now.

Do we need to worry about GCC for this?

Differential Revision: https://reviews.llvm.org/D117718
2022-01-19 13:51:55 -08:00
Keith Smiley 67090e3446 [lld-macho] Implement -noall_load
This flag is the default, so in ld64 it is not implemented, but it can
be useful to negate previous -all_load arguments. Specifically if your
build system has some global linker flags, that you may want to negate
for specific links. We use something like this today to make sure some
C++ symbols are automatically discovered for all links, which passing
-all_load hides.

Differential Revision: https://reviews.llvm.org/D117629
2022-01-19 13:12:18 -08:00
Fangrui Song 5bd38a2826 [ELF] Fix split-stack caller with hidden non-split-stack callee
Fix a regression after aabe901d57 (`[ELF] Remove
one redundant computeBinding`): isLocal() does not indicate that the symbol is
originally local. For simplicity, just drop this optimization.
2022-01-19 12:25:01 -08:00
Fangrui Song 0aae2bf373 [lld-macho] Add --start-lib --end-lib
In ld.lld, when an ObjFile/BitcodeFile is read in --start-lib state, the file is
given archive semantics. --end-lib closes the previous --start-lib. A build
system can use this feature as an alternative to archives. This patch ports
the feature to lld-macho.

--start-lib and --end-lib are positional, unlike usual ld64 options.
I think the slight drawback does not matter as (a) reusing option names
is convenient for build systems (b) `--start-lib a.o b.o --end-lib` conveys more
information than an alternative design: `-objlib a.o -objlib b.o` because
--start-lib makes it clear which objects are in the same conceptual archive.
This provides flexibility (c) `-objlib`/`-filelist` interaction may be weird.

Close https://github.com/llvm/llvm-project/issues/52931

Reviewed By: #lld-macho, Jez Ng, oontvoo

Differential Revision: https://reviews.llvm.org/D116913
2022-01-19 10:14:49 -08:00
Fangrui Song d838bf2adc [ELF] Allow non-bitcode archive with an empty index
When an archive with an empty index contains only bitcode files, it is
handled as a group of lazy (--start-lib) object files. If there is a
non-bitcode file, there will be a diagnostic a la GNU ld.

For some programs, the archive member extraction ratio is high (e.g. for chrome,
79% archive members are extracted according to --print-archive-stats=). Because
symbol interning is cached for ObjFile::parseLazy but not for ArchiveFile,
parsing an archive as a group of --start-lib object files may be faster.

If the linker speculatively creates section representations for archive members,
the archive index will not be used.

If we take the above view, the archive index is essentially useless. If a user
wants a fast build without using --start-lib, they may just build thin archives
without index (`ar rcS --thin`).

Therefore, I suggest that we no longer treat the code as a hack, instead as a
supported feature. I believe we will do this anyway if we add parallel symbol
interning (parallel symbol interning for lazy object files is simpler than that
for archives).

Ecosystem issues:

* parseLazy actually has nearly the same behavior as ArchiveFile::parse, but the symbol order may be different.
* users may get addicted to the behavior and build archives not working with GNU ld and gold. I think it is easy to rebuild archives to be compatible.

Reviewed By: ikudrin

Differential Revision: https://reviews.llvm.org/D117284
2022-01-19 10:01:53 -08:00
Ayke van Laethem d649faff9c [LLD][COFF] Support GNU style == aliases
D46245 added support for this in llvm-libtool, but while lld-link can
also create .lib files from .def files it didn't support aliases.

I compared the Inputs/library.def test against the output from
llvm-libtool and it matches, except for the fact that lld-link reorders
functions for some reason.

I have also verified that this fixes a bug I was running into while
trying to compile .def files to .lib files in MinGW-w64 (using lld-link
instead of llvm-libtool).

Differential Revision: https://reviews.llvm.org/D113365
2022-01-19 14:22:13 +01:00
Fangrui Song 288082d45d [ELF] Move SHT_REL/SHT_RELA handling from createInputSection to initializeSections
This simplifies the code a bit. While here,

* change the `multiple relocation sections` diagnostic from `fatal` to `error` and include the relocated section name.
* drop less useful name from `getRelocTarget`. Without -r/--emit-relocs we don't need to get SHT_REL/SHT_RELA names.
2022-01-18 23:31:51 -08:00
Fangrui Song 84944b63f3 [ELF] Simplify ObjFile<ELFT>::initializeSections. NFC 2022-01-18 22:45:04 -08:00
Fangrui Song 5f404a749a [ELF] De-template InputSectionBase::getLocation. NFC 2022-01-18 17:33:58 -08:00
Fangrui Song eafd34581f [ELF] Simplify/optimize EhInputSection::split
and change some `fatal` to `errorOrWarn`.

EhFrame.cpp is a helper file. We don't place all .eh_frame implementation there,
so the code move is fine.
2022-01-18 17:03:23 -08:00
Vincent Lee e5347f2556 [lld-macho] Allow deduplicate-literals to be overridden
It's still uncertain whether we want to have `deduplicate-literals` be the
default flag for LLD out of the box or not. If `deduplicate-literals` is the default
behavior, then we will need a way to override it and not deduplicate. Luckily, we
have `no_deduplicate` to fill this gap. For now, I've set the default to be false
which aligns with the existing behavior. That can always be changed after
discussions on D117250.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D117387
2022-01-18 15:42:59 -08:00
Sam Clegg ec47dba1c8 [lld][WebAssembly] Perform data relocations during start function
We already perform memory initialization and apply global relocations
during start.  It makes sense to perform data relocations too.  I think
the reason we were not doing this already is solely historical.

Differential Revision: https://reviews.llvm.org/D117412
2022-01-18 14:08:42 -08:00
Sam Clegg ae1573e131 [lld][WebAssembly] Reinstate mistakenly disabled test. NFC
It seems the first half of this test was disabled in error
as part of https://reviews.llvm.org/D93066.

Differential Revision: https://reviews.llvm.org/D117594
2022-01-18 12:22:22 -08:00
Alexander Shaposhnikov 2bb7f226af [lld] Fix typo. NFC 2022-01-18 02:33:27 +00:00
Fangrui Song 83c7f5d3fb [ELF] EhInputSection::split: remove unneeded check 2022-01-17 13:59:52 -08:00
Fangrui Song ac0986f880 [ELF] Change std::vector<InputSectionBase *> to SmallVector
There is no remaining std::vector<InputSectionBase> now. My x86-64 lld
executable is 2KiB smaller.
2022-01-17 10:25:07 -08:00
Fangrui Song f855074ed1 [ELF] GnuHashTableSection: replace stable_sort with 2-key sort. NFC
strTabOffset stabilizes llvm::sort. My x86-64 executable is 5+KiB smaller.
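
Why a second key can stand in for a stable sort, as a sketch: if the string table offset is unique per entry and grows with the original order (an assumption made for this example), sorting on (bucket, offset) gives the same result as a stable sort on bucket alone, whatever order the input arrives in.

```python
import random

# Hypothetical entries: (bucket index, offset of the name in the string table).
# Offsets increase with the original order.
entries = [(2, 1), (0, 6), (2, 11), (1, 17), (0, 22)]


def sort_two_key(items):
    # (bucket, offset) pairs are unique, so ties cannot occur and the
    # result does not depend on the sort being stable.
    return sorted(items, key=lambda e: (e[0], e[1]))


stable = sorted(entries, key=lambda e: e[0])  # Python's sort is stable

shuffled = entries[:]
random.shuffle(shuffled)
two_key = sort_two_key(shuffled)
print(two_key == stable)
```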
2022-01-17 00:34:42 -08:00
Fangrui Song 54fe70bfba [ELF] RelocationScanner::scanOne: replace rel.r_offset with offset. NFC 2022-01-17 00:05:27 -08:00
Fangrui Song 4c36567179 [ELF] Relocations: remove some cast<Undefined>. NFC 2022-01-17 00:02:47 -08:00
Fangrui Song b8d4eb84d7 [ELF] De-template getAlternativeSpelling. NFC 2022-01-16 23:56:25 -08:00
Fangrui Song 9c4292a59d [ELF] Remove unneeded SyntheticSection memset(*, 0, *)
After the D33630 fallout was properly fixed by a4c5db30be.

Tested by D37462/D44986 tests, the new --no-rosegment test in build-id.s, and a few --rosegment/--no-rosegment programs.
2022-01-16 22:51:57 -08:00
Fangrui Song a4c5db30be [ELF] Remove redundant fillTrap and memset(*, 0, *). NFC
The new tests in build-id.s would catch problems if we made a mistake here.
2022-01-16 22:37:31 -08:00
Fangrui Song d46054d75d [ELF][test] Add --build-id tests for -z separate-loadable-segments and --no-rosegment 2022-01-16 22:36:22 -08:00
Fangrui Song aad90763d9 [ELF] RelocationSection<ELFT>::writeTo: use unstable partition 2022-01-16 21:44:19 -08:00