Handle the case where LLVM_REVISION is undefined (due to LLVM_APPEND_VC_REV=OFF
or otherwise) by setting it to "<unknown>", as was done before D123549.
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D123852
When processing profile data for a shared object or PIE, perf2bolt needs
to calculate the base address of the binary from the map info reported
by the perf tool. When the mapping data provided is for the second
segment (or any segment other than the first) and the segment's file
offset does not match its memory offset, perf2bolt makes a wrong
assumption about the binary base address.
Add a function that calculates the binary base address from the reported
memory mapping, and use the returned base for further address
adjustments.
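A minimal sketch of the base address calculation described above, not the
code from this patch. The `SegmentInfo` struct and the `computeBaseAddress`
helper are hypothetical names, and segment alignment is assumed to be
non-zero. The idea is to find the segment whose aligned file offset matches
the offset reported for the mapping, then subtract that segment's aligned
memory offset from the mapping address.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical description of one loadable segment of the binary.
struct SegmentInfo {
  uint64_t Address;    // virtual address from the program header
  uint64_t FileOffset; // offset of the segment in the file
  uint64_t Alignment;  // segment alignment, assumed non-zero
};

// Round Value down to a multiple of Align.
inline uint64_t alignDown(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && "alignment must be non-zero");
  return Value - Value % Align;
}

// Return the load base of the binary, or nothing if no segment matches the
// file offset reported for the mapping.
inline std::optional<uint64_t>
computeBaseAddress(const std::vector<SegmentInfo> &Segments,
                   uint64_t MMapAddress, uint64_t FileOffset) {
  for (const SegmentInfo &Seg : Segments) {
    if (alignDown(Seg.FileOffset, Seg.Alignment) != FileOffset)
      continue;
    // Use the segment's aligned memory offset, not its file offset.
    return MMapAddress - alignDown(Seg.Address, Seg.Alignment);
  }
  return std::nullopt;
}
```

With a second segment at virtual address 0x2000 and file offset 0x1000,
a mapping at 0x7f0000002000 yields the base 0x7f0000000000, whereas
subtracting the file offset would give 0x7f0000001000.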
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D123755
ld might relax ADRP+ADD or ADRP+LDR sequences to ADR+NOP. Add
the new case to skipRelocation for aarch64.
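A minimal sketch of how such a check could look, assuming the relocated
instruction word is available. The `RelocKind` enum and the
`shouldSkipRelocation` helper are hypothetical stand-ins for illustration;
only the ADR and NOP encodings come from the A64 instruction set.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical relocation kinds; real code would use the ELF R_AARCH64_*
// constants instead.
enum class RelocKind { AdrpPageHi21, AddLo12, LdstLo12, Other };

// ADR has bit 31 clear and bits 28:24 equal to 0b10000 (ADRP has bit 31 set).
inline bool isADR(uint32_t Insn) {
  return (Insn & 0x9f000000u) == 0x10000000u;
}

// NOP has a single fixed encoding.
inline bool isNOP(uint32_t Insn) { return Insn == 0xd503201fu; }

// A relocation is stale if the linker already rewrote the instruction it
// was meant for: ADRP became ADR, and the ADD or LDR became NOP.
inline bool shouldSkipRelocation(RelocKind Kind, uint32_t Insn) {
  switch (Kind) {
  case RelocKind::AdrpPageHi21:
    return isADR(Insn);
  case RelocKind::AddLo12:
  case RelocKind::LdstLo12:
    return isNOP(Insn);
  case RelocKind::Other:
    return false;
  }
  return false;
}
```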
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D123334
BOLT expects PC-relative relocations in data sections to reference code
and the relocated data to form a jump table. However, there are cases
where PC-relative addressing is used for data-to-data references
(e.g. clang-15 can generate such code). BOLT should recognize and ignore
such relocations. Otherwise, they will be considered relocations not
claimed by any jump table and cause a failure in the strict mode.
Reviewed By: yota9, Amir
Differential Revision: https://reviews.llvm.org/D123650
Returning `std::array<uint8_t, N>` instead of a `StringRef` gives the hashing functions better ergonomics:
* When a `StringRef` is returned, client code has to jump through hoops with string manipulation instead of dealing with a fixed array of bytes directly, which is more natural
* Returning `std::array<uint8_t, N>` avoids the need for the hasher classes to keep a field just for the purpose of wrapping it and returning it as a `StringRef`
As part of this patch also:
* Introduce `TruncatedBLAKE3` which is useful for using BLAKE3 as the hasher type for `HashBuilder` with non-default hash sizes.
* Make `MD5Result` inherit from `std::array<uint8_t, 16>` which improves & simplifies its API.
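To illustrate the ergonomics argument, here is a minimal sketch using a toy
hasher. `ToyHasher` and `TruncatedToyHasher` are hypothetical classes built
on 64-bit FNV-1a. They are not the LLVM hashers and only mimic the shape of
an API that returns a fixed-size byte array, including a truncating variant.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Toy 64-bit FNV-1a hasher standing in for a real hash function.
class ToyHasher {
public:
  void update(std::string_view Data) {
    for (unsigned char C : Data) {
      State ^= C;
      State *= 0x100000001b3ULL; // FNV-1a 64-bit prime
    }
  }

  // Return the digest by value as big-endian bytes; no member buffer is
  // needed just to back a string view.
  std::array<uint8_t, 8> final() const {
    std::array<uint8_t, 8> Out{};
    for (size_t I = 0; I < Out.size(); ++I)
      Out[I] = static_cast<uint8_t>(State >> (8 * (7 - I)));
    return Out;
  }

private:
  uint64_t State = 0xcbf29ce484222325ULL; // FNV-1a 64-bit offset basis
};

// Wrapper that keeps only the first N bytes of the digest.
template <size_t N> class TruncatedToyHasher {
  static_assert(N <= 8, "cannot truncate to more bytes than the digest has");

public:
  void update(std::string_view Data) { Inner.update(Data); }

  std::array<uint8_t, N> final() const {
    const std::array<uint8_t, 8> Full = Inner.final();
    std::array<uint8_t, N> Out{};
    std::copy_n(Full.begin(), N, Out.begin());
    return Out;
  }

private:
  ToyHasher Inner;
};
```

Callers can compare, index, or copy the result directly, and the digest
size is part of the type.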
Differential Revision: https://reviews.llvm.org/D123100
Add !isTailCall to the isUnconditionalBranch check to keep x86 and
aarch64 in sync and to fix the fixDoubleJumps pass on aarch64.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D122929
The bfd linker adds the symbol versioning string to the symbol name in symtab.
Skip the versioning part in order to find the registered PLT function.
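A minimal sketch of stripping the version suffix, assuming versioned names
have the form `name@VERSION` or `name@@VERSION`. The `stripSymbolVersion`
helper is a hypothetical name used only for illustration.

```cpp
#include <cassert>
#include <string_view>

// Return the symbol name without its versioning suffix. If there is no '@',
// find() returns npos and substr() keeps the whole name.
inline std::string_view stripSymbolVersion(std::string_view Name) {
  return Name.substr(0, Name.find('@'));
}
```

The stripped name can then be used as the lookup key for the registered
PLT function.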
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D122039
Read static relocations at the same address as dynamic ones in order to
update constant island data addresses properly.
Differential Revision: https://reviews.llvm.org/D122100
Check that the function will be emitted in the final binary. Preserving
the old function address is needed in case it is a PLT trampoline, which
is currently not moved by BOLT.
Differential Revision: https://reviews.llvm.org/D122098
BOLT treats aarch64 objects located in text as empty functions with
constant islands. Emit them into the new text section with at least
8-byte alignment.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D122097
AArch64 requires constant islands (CI) to be aligned to 8 bytes due to
restrictions of memory access instructions, e.g. ldr with an immediate
offset, where the immediate must be aligned to 8 bytes.
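A minimal sketch of the alignment rule, assuming the emitter picks an
alignment per constant island and rounds the emission offset up to it.
`MinConstantIslandAlignment`, `alignTo`, and `constantIslandAlignment` are
hypothetical names used for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical lower bound for constant island alignment on AArch64.
const uint64_t MinConstantIslandAlignment = 8;

// Round Value up to the next multiple of Align.
inline uint64_t alignTo(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && "alignment must be non-zero");
  return (Value + Align - 1) / Align * Align;
}

// Never emit a constant island with less than the minimum alignment.
inline uint64_t constantIslandAlignment(uint64_t Requested) {
  return Requested < MinConstantIslandAlignment ? MinConstantIslandAlignment
                                                : Requested;
}
```

For example, an island requested with 4-byte alignment at offset 13 would
be emitted at offset 16.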
Differential Revision: https://reviews.llvm.org/D122065
It seems the earlier implementation does not follow the description
in LoopRotationPass.h: It rotates loops even if they are already laid out
correctly. The diff adjusts the behaviour.
Given that the impact of LoopInversionPass is minor, this change won't
yield significant perf differences. Tested on clang-10: there seems to be a
0.1%-0.3% cpu win and a small reduction of branch misses.
**Before:**
BOLT-INFO: 120 Functions were reordered by LoopInversionPass
**After:**
BOLT-INFO: 79 Functions were reordered by LoopInversionPass
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D121921
Run tentativeLayoutRelocMode twice only if UseOldText option was passed.
Refactor the BF loop to break once the condition is met.
Differential Revision: https://reviews.llvm.org/D121825
Remove tables from X86MCPlusBuilder, make use of llvm::X86 mnemonic tables.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D121573
Since LLVM MC now preserves redundant AdSize override prefix (0x67), remove it
in BOLT explicitly (-x86-strip-redundant-adsize, on by default).
Test Plan:
`bin/llvm-lit -a bolt/test/X86/addr32.s`
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D120975
The BinaryEmitter uses the opts::AlignText value to align the hot text
section. Also check that opts::AlignText is at least equal to
opts::AlignFunctions, for the same reason as described in D121392.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D121728
The cold text section alignment is set using the maximum alignment value
passed to emitCodeAlignment. To calculate the tentative layout
correctly, explicitly set the minimum alignment of such sections to the
maximum possible function alignment.
Differential Revision: https://reviews.llvm.org/D121392
For AArch64 in some cases/some distributions ld uses 64K alignment of LOAD segments by default.
Reviewed By: yota9, maksfb
Differential Revision: https://reviews.llvm.org/D119267
aarch64 uses trampolines located in the .iplt section, which contains
PLT-like trampolines that use the value stored in .got. In this case
we don't have a JUMP_SLOT relocation, but we have a symbol that belongs
to the ifunc trampoline, so use it and set the PLT symbol for such functions.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D120850
Address fuzzer crash on malformed input:
```
BOLT-ERROR: cannot get section contents for .dynsym: The end of the file was unexpectedly encountered.
```
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D121068
This patch enables PLT analysis for aarch64. It is used by the static
relocations in order to provide final symbol address of PLT entry for some
instructions like ADRP.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D118088
Reassigning the operand didn't update the operand type, which resulted in an
assertion failure (``Assertion `isReg() && "This is not a register operand!"' failed.``).
Reset the instruction instead.
Test Plan:
```
ninja check-bolt
...
PASS: BOLT-Unit :: Core/./CoreTests/X86/MCPlusBuilderTester.ReplaceRegWithImm/0 (90 of 136)
```
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D120263
We were not correctly handling the conversion of DW_AT_high_pc into
DW_AT_ranges when the size of DW_AT_high_pc is not 4 or 8 bytes.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D120528
PC-relative memory operand could reference a different object from
the one located at the target address, e.g. when a negative offset
is used. Check relocations for the real referenced object.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D120379
Further improve error handling in BOLT by reporting `RewriteInstance` errors in
a library and fuzzer-friendly way instead of exiting.
Follow-up to D119658
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D120224
Fix data race reported by ThreadSanitizer in clang.test:
```
ThreadSanitizer: data race /data/llvm-project/bolt/lib/Passes/ShrinkWrapping.cpp:1359:28
in llvm::bolt::ShrinkWrapping::moveSaveRestores()
```
The issue is with incrementing global counters from multiple threads.
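A minimal sketch of one way to make such counters safe: declare them as
`std::atomic`. This illustrates the technique and is not necessarily the
exact change in this patch. `SpillsMoved` and `countInParallel` are
hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical global statistic updated by worker threads.
std::atomic<uint64_t> SpillsMoved{0};

// Increment the counter from several threads and return the final value.
inline uint64_t countInParallel(unsigned NumThreads, unsigned PerThread) {
  SpillsMoved = 0;
  std::vector<std::thread> Workers;
  for (unsigned T = 0; T < NumThreads; ++T) {
    Workers.emplace_back([PerThread] {
      for (unsigned I = 0; I < PerThread; ++I)
        // Atomic read-modify-write; a plain ++ on a non-atomic integer
        // would be a data race here.
        SpillsMoved.fetch_add(1, std::memory_order_relaxed);
    });
  }
  for (std::thread &W : Workers)
    W.join();
  return SpillsMoved.load();
}
```

Relaxed ordering is enough in this sketch because the counter is only read
after all workers have been joined.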
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D120218
Refactor createBinaryContext and RewriteInstance/MachORewriteInstance
constructors to report an error in a library and fuzzer-friendly way instead of
returning a nullptr or exiting.
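A minimal sketch of the error reporting style described above, written
without LLVM headers so that it stays self-contained. `Context`,
`ContextOrError`, and `createContext` are hypothetical stand-ins, and the
supported architecture list is invented for the example. The point is that
the factory hands the failure reason back to the caller instead of calling
exit() or returning a bare nullptr.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical object built by the factory.
struct Context {
  std::string Arch;
};

// Minimal stand-in for an "expected" type: either a value or an error text.
struct ContextOrError {
  std::unique_ptr<Context> Value;
  std::string Error;
};

// Report failures to the caller, who decides whether to print, retry, or
// abort. A fuzzer or a library client can keep running after an error.
inline ContextOrError createContext(const std::string &Arch) {
  if (Arch != "x86_64" && Arch != "aarch64")
    return {nullptr, "unsupported architecture: " + Arch};
  return {std::make_unique<Context>(Context{Arch}), ""};
}
```

A command-line tool would print `Error` and exit at the top level, while
the library code itself never terminates the process.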
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D119658
This patch changes patchELFAllocatableRelaSections: instead of going
through the old relocation sections and updating the relocation offsets,
it now emits the relocations stored in binary sections. This is needed
in case we would like to remove and add dynamic relocations while BOLT
runs, and it is used by the golang support pass. Note: currently we emit
relocations into the old sections, so their total number must be less
than or equal to the old number.
Testing: No special tests are needed, since this patch does not fix
anything or add new functionality (it only prepares for it). Every
PIC-compiled test binary will use this code and thus become a test.
Just in case, aarch64 dynamic relocation tests were added.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D117612
Added the ability to append new entries to a DIE. This is useful for
standardizing DWARF4 Split DWARF and simplifies the implementation of DWARF5.
Multiple DIEs can share an abbrev, so the current limitation is that only
unique attributes can be added.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D119577