Commit Graph

780 Commits

Author SHA1 Message Date
Maksim Panchenko 9f15b9f3c2 [BOLT] Fix debug line info in lite relocation mode
Summary: Emit line info for functions that were not emitted in relocation mode.

(cherry picked from FBD24267650)
2020-10-12 20:16:59 -07:00
Alexander Shaposhnikov 473a6199ab Add first bits to support emitting instrumented code on MachO
Summary:
Add first bits to support emitting instrumented code on MachO.
This diff enables us to instrument branches / emit counters.

(cherry picked from FBD24255164)
2020-10-12 10:11:17 -07:00
Maksim Panchenko 247b4181a3 [BOLT] Emit symbol size for functions
Summary:
On targets that support it, emit size of the emitted function symbol.

At the moment there's no use for the size except that it is visible in a
temporary .o file symbol table.

(cherry picked from FBD24246177)
2020-10-12 13:02:50 -07:00
Alexander Shaposhnikov 528da5d795 Fix handling of _end symbol on MachO
Summary: _end is "defined" but its address doesn't belong to any section. This diff adds special handling for this symbol.

(cherry picked from FBD24249120)
2020-10-12 03:56:50 -07:00
Maksim Panchenko c27e254056 [BOLT] Change label name for cold fragments
Summary:
Append ".cold.0" suffix to the original part of the name, such that
"foo/1" becomes "foo.cold.0/1" instead of "foo/1.cold.0".

(cherry picked from FBD24246112)
2020-10-12 11:26:07 -07:00
Alexander Shaposhnikov 7f1fd80762 Add support for emitting code into a new segment on MachO
Summary: Add support for emitting code into a new segment on MachO (in the instrumentation mode).

(cherry picked from FBD24097547)
2020-10-02 19:25:17 -07:00
Maksim Panchenko 843309c075 [BOLT] Disable PatchEntries in non-relocation mode on ELF
Summary:
At the moment we are not using PatchEntries pass in non-relocation mode
on ELF. However, we will use it on MachO.

(cherry picked from FBD24235271)
2020-10-09 19:37:12 -07:00
Maksim Panchenko 0465d952cc [BOLT] Refactor PatchEntries pass
Summary:
Use injected functions with fixed addresses to patch original function
entries.

(cherry picked from FBD24133890)
2020-10-09 16:06:27 -07:00
Alexander Shaposhnikov 0376abe252 Add ToolPath field to MachORewriteInstance
Summary: Add ToolPath field to MachORewriteInstance. This will enable us to locate the runtime library relative to the tool's location.

(cherry picked from FBD24183448)
2020-10-07 17:52:47 -07:00
Rafael Auler 35632d4828 [BOLT] Refactor relocations class impl per arch, NFC
Summary:
Do not mix relocation codes from different archs. Even though
they do not intersect at the moment, this could easily introduce bugs
once new relocations are supported (for example, ILP32 for AArch64).

(cherry picked from FBD24169425)
2020-10-07 15:40:51 -07:00
Alexander Shaposhnikov 59c21b42da Precompute symbol section indices on MachO
Summary: Precompute symbol section indices on Mach-O.

(cherry picked from FBD24133810)
2020-10-06 01:30:55 -07:00
Alexander Shaposhnikov 71e185f2da Add -check-overlapping-elements option
Summary:
This diff adds a command line option to disable the check of overlapping elements in Mach-O parsing. This check in its current form is prohibitively expensive for large binaries.
A long-term fix would be to reimplement the check in a more efficient manner (and contribute it to the upstream).

(cherry picked from FBD24109468)
2020-10-05 02:35:26 -07:00
Rafael Auler d7fb998637 [BOLT] Fix sign issue when validating X86 relocations
Summary:
In analyzeRelocations, we extract the result of the relocation
from binary code to recreate the target of it in a few special cases.
For R_X86_64_32S relocations, however, we were neglecting the
possibility of the encoded value in the instruction to be negative.

(cherry picked from FBD24096347)
2020-10-05 12:41:03 -07:00
Alexander Shaposhnikov 2808c800e8 Read the entry point address on MachO
Summary: Read the entry point address on MachO

(cherry picked from FBD24039370)
2020-09-30 19:10:24 -07:00
Amir Ayupov d1ec11b28f postProcessEntryPoints: return after setIgnored and setSimple are set
Summary:
This patch fixes the assertion failure during instrumentation.

The assertion is raised by `getInstructionAtOffset` , which expects `CurrentState` to be either `Disassembled` or `CFG`.

The function is called from `postProcessEntryPoints`, which goes over Labels and performs a series of checks. The checks call BinaryFunction methods `setSimple(false)` or `setIgnored()`.
However, if `setIgnored` is invoked, it resets the state to `Empty`. Thus subsequent call to `getInstructionAtOffset` will fail.

(cherry picked from FBD24005197)
2020-09-29 19:37:47 -07:00
Alexander Shaposhnikov 0601ae6438 Set InputFileOffset for MachO sections
Summary: Set InputFileOffset for MachO sections.

(cherry picked from FBD23903542)
2020-09-24 03:22:31 -07:00
Maksim Panchenko a10f799290 [BOLT][Linux] Initial support for special Linux Kernel sections
Summary:
Enable initial support for reading and patching special Linux kernel sections.

Author: Tanvir Ahmed Khan <takh@fb.com>

GitHub Author: takhandipu

(cherry picked from FBD22998869)
2020-09-15 11:42:03 -07:00
Maksim Panchenko a82cff0f52 [BOLT] Eliminate "shallow" function lookup
Summary:
Whenever we search for a function based on its address in the input
binary, we now always return a corresponding fragment for split
functions. If the user needs an access to the main fragment, they can
call getTopmostFragment().

(cherry picked from FBD23670311)
2020-09-14 15:48:32 -07:00
Maksim Panchenko 62469b5036 [BOLT] Do no map sections with zero address
Summary:
Sections that do not originate from the input binary will have an
input address set to zero and thus do not have to be mapped.

Mapping such sections caused a build time regression in non-relocation
mode.

(cherry picked from FBD23670334)
2020-09-14 14:31:50 -07:00
Amir Ayupov 8c4ba8f165 Bugfix for splitting critical edges in shrink wrapping
Summary:
Fix issue with splitting critical edges originating at
the same BB in ShrinkWrapping::splitFrontierCritEdges.

Splitting of critical edges originating at the same FromBB
wasn't handled correctly as the Frontier at index corresponding
to FromBB was overwritten with basic blocks created for
multiple DestinationBBs.

(cherry picked from FBD23232398)
2020-08-20 19:00:29 -07:00
Rafael Auler 9bc4a8db18 Fix BAT cold-to-hot mappings
Summary:
Right now, if activity is recorded in cold parts, we write to
the .fdata file the ".cold" name instead of the correct name of the
function. Fix this.

(cherry picked from FBD23148705)
2020-08-18 11:55:56 -07:00
Maksim Panchenko aaf49b095f [perf2bolt] Issue error when writing YAML for BOLTed input
Summary:
When the input file is processed by BOLT, we cannot save profile in YAML
format as it requires CFG representation of functions.

(cherry picked from FBD22941794)
2020-08-12 18:10:41 -07:00
Alexander Shaposhnikov 8b989765f6 Add first bits to support emitting more than 255 sections on MachO
Summary:
Add first bits to support emitting more than 255 sections.
Update llvm.patch to include the changes in MachOObjectFile.cpp.

(cherry picked from FBD22655053)
2020-07-21 17:26:00 -07:00
Rafael Auler 6547813d1f Print when we are operating in lite mode
Summary:

(cherry picked from FBD22968343)
2020-08-06 14:43:33 -07:00
takh 0033a7612d Linux kernel marker to update special sections
Summary: This diff adds SDT marker like LK marker to update special lk sections

(cherry picked from FBD22932157)
2020-08-04 13:50:00 -07:00
Maksim Panchenko 8f2a962866 [perf2bolt] Fix for SKL bug workaround
Summary:
Some LBR traces contain less than 16/32 entries. When the first trace is
less than the standard length, we fail to enable SKL bug workaround with
BAT mode enabled. The result is a bad profile from perf2bolt. The fix
is to check the length of every trace, not just the first one.

Issue facebookincubator/BOLT#94

(cherry picked from FBD22917971)
2020-08-04 10:59:37 -07:00
Amir Ayupov 8f7cb54ae5 Added execution count threshold option
Summary:
Added execution count threshold option (execution-count-threshold) controlling the optimizations that are sensitive to the accuracy of the profiling data:
- BB reordering
- function splitting
- frame opts
- shrink wrapping
- indirect call promotion

(cherry picked from FBD22682171)
2020-07-27 18:07:18 -07:00
Rafael Auler c6799a689d [BOLT] Fix stack alignment for runtime lib
Summary:
Right now, the SAVE_ALL sequence executed upon entry of both
of our runtime libs (hugify and instrumentation) will cause the stack to
not be aligned at a 16B boundary because it saves 15 8-byte regs. Change
the code sequence to adjust for that. The compiler may generate code
that assumes the stack is aligned by using movaps instructions, which
will crash.

(cherry picked from FBD22744307)
2020-07-27 16:52:51 -07:00
Rafael Auler ed02946281 [BOLT] Fix hot_end symbol update with user function order
Summary:
If no profile data is provided, but only a user-provided order
file for functions, fix the placement of the __hot_end symbol.

(cherry picked from FBD22713265)
2020-07-24 10:28:36 -07:00
Amir Ayupov 6b89a9cb44 Handle intra-function call in instrumentOneTarget
Summary: Added handling of intra-function calls (internal control transfer by call instruction) to instrumentOneTarget

(cherry picked from FBD22606033)
2020-07-17 23:16:56 -07:00
Amir Ayupov f7d4bed9d1 Extracted sequence insertion function into helper function
Summary: Factored out common code from multiple places into a helper function

(cherry picked from FBD22606101)
2020-07-17 23:16:52 -07:00
Maksim Panchenko 937244b4f2 [BOLT] Allow to specify -reorder-functions option multiple times
Summary: Need to be able to override the option.

(cherry picked from FBD22583585)
2020-07-17 10:08:51 -07:00
Rafael Auler 6c8fc28892 Revert "[BOLT] Add the FeatureMiner pass to extract Calder's features."
This reverts commit 2476f46af02ccce04e9ed456462dd098460e4e1f.

Reviewed By: maks

(cherry picked from FBD28111787)
2020-07-16 17:35:55 -07:00
Rafael Auler 170f73ac9e [BOLT] Fix fix-branches in presence of JRCXZ and friends
Summary:
Do not fail/assert when trying to reorder blocks that terminate
with JRCXZ/JECXZ/LOOP instructions. We cannot invert the condition of
these instructions, so just treat them accordingly in fixBranches().

(cherry picked from FBD22487107)
2020-07-15 23:02:58 -07:00
Angélica Moreira 181327d763 [BOLT] Add the FeatureMiner pass to extract Calder's features.
(cherry picked from FBD19844247)
2020-07-07 23:01:22 -07:00
Tanvir Ahmed Khan f40ffa0dc8 Report stale sample count and percentage
Summary: This diff adds extra reporting of total number of stale branch samples for the binary.

(cherry picked from FBD22304965)
2020-07-06 21:35:44 -07:00
Maksim Panchenko 3e795c8a5f [BOLT] Ignore addresses from non-allocatable sections
Summary:
We've accidentally registered TBSS section address with a BinaryContext
resulting in addresses being attributed to it when
getSectionForAddress() was called.

(cherry picked from FBD22369323)
2020-07-06 14:39:44 -07:00
takh a9fac6a89f Support for CDF distribution of heatmap buckets
Summary: This diff adds the support for generating CDF distributions of heatmap buckets.

(cherry picked from FBD22128291)
2020-06-18 16:47:21 -07:00
Xun Li 84eae1a413 [Bolt] Improve coding style for runtime lib related code
Summary:
Reading through the LLVM coding standard again, realized a few places where I didn't follow the standard when coding. Addressing them:
1. prefer static functions over functions in unnamed namespace.
2. #include as little as possible in headers
3. Have vtable anchors.

(cherry picked from FBD22353046)
2020-07-02 14:28:13 -07:00
Maksim Panchenko e233dec467 [BOLT] Skip R_X86_64_PLT32 relocation verification
Summary:
R_X86_64_PLT32 relocations recorded by the linker may point to the PLT
section instead of being resolved to the symbol reported by the
relocation. Sometimes they could point to the symbol too. Disable
internal verification for this type of relocation.

Include a fix for symbol address calculation when it is based on the
extracted value. The truncation to the relocation size is needed if
the results overflows.

(cherry picked from FBD22317952)
2020-06-30 19:58:43 -07:00
Rafael Auler 26ad0bd951 [TESTS] Re-add issue20/issue26 tests
Summary:
Re-add tests removed because they used to depend on yaml2obj.
Rewrite them with an assembler (llvm-mc) and use the system linker to
produce a valid ELF as input to BOLT.

(cherry picked from FBD22323449)
2020-06-30 18:36:49 -07:00
Rafael Auler 41cb6b68ed Update X86/pre-aggregated-perf.test
Summary:
Add REQUIRED statement.

(cherry picked from FBD22290759)
2020-06-24 18:24:07 -07:00
Maksim Panchenko ffaba22476 [BOLT] Do not emit duplicate org symbols
Summary:
When adding symbols for patched functions, we may end up emitting
multiple symbols per function if the function has multiple names (e.g.
after identical code folding by the linker).

(cherry picked from FBD22294112)
2020-06-24 12:36:15 -07:00
Maksim Panchenko 250ca4082e [BOLT] Add static binary support
Summary:
Accept binaries without dynamic section/segment as a valid input.

Modify the check for invalid debug info "executables" that are result of
running "objcopy --only-keep-debug". Instead of checking for an empty
dynamic segment, check that ".text" is mapped into a valid segment.

Move SegmentMapInfo inside BinaryContext.

Fixes facebookincubator/BOLT#91

Temporarily removing issue*.test tests that use yaml2obj and operate on
fake binaries.

(cherry picked from FBD22271481)
2020-06-26 16:52:07 -07:00
Maksim Panchenko 94230a2c07 [perf2bolt] Relax rules for aggregation in strict mode
Summary:
While aggregating perf.data events, even in strict mode, there is no
need to process all functions since we are not generating an output
binary. However, it's still important to convert data for as many
functions as possible, even for ones with unknown internal control flow.

(cherry picked from FBD22248390)
2020-06-25 16:29:17 -07:00
Maksim Panchenko 4aaa8892dd [BOLT] Ignore duplicate relocations
Summary:
lld linker may emit static relocations against addresses that also have
dynamic relocations associated with them. When this happens, BOLT fails
to validate the extracted value at the address.

Read dynamic relocations in the binary and ignore static relocations at
addresses that have a duplicate dynamic relocation.

(cherry picked from FBD22192345)
2020-06-23 12:22:58 -07:00
Maksim Panchenko 13baf47a3c [BOLT] Add '-force-patch' to forcefully patch old entries
Summary:
The option is useful for debugging.

Also, print personality function when dumping a function.

(cherry picked from FBD22169482)
2020-06-22 13:08:28 -07:00
Maksim Panchenko 4946b881a8 [BOLT] Fix getNewValueForSymbol()
Summary:
getNewValueForSymbol() uses orc::RTDyldObjectLinkingLayer::findSymbol()
to resolve symbol values. The latter will always return JITSymbol,
even if there was no symbol defined. The address for the undefined
symbol will be zero, but some symbols could legally be resolved to zero
too.

We need to distinguish between real zero-valued symbols and symbols that
were not emitted and are not visible by orc::RTDyldObjectLinkingLayer.
If zero address is returned by ORC, check for a binary data with the
same name and use its address for the symbol resolution.

(cherry picked from FBD22175269)
2020-06-22 16:16:08 -07:00
Maksim Panchenko ae296ea665 [BOLT] Allow to overwrite -use-old-text option
(cherry picked from FBD22169409)
2020-06-22 14:05:19 -07:00
Maksim Panchenko 12b7987d4f [BOLT] Ignore functions that failed validation
Summary:
If a function failed internal calls validation, we can ignore it and
keep processing the binary.

(cherry picked from FBD22169381)
2020-06-22 12:59:03 -07:00