Strip tools can cause some symbols in .dynsym to have an invalid section index.
This update safely keeps such broken symbols intact.
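For illustration, a minimal sketch of the kind of check involved, using simplified ELF definitions rather than BOLT's actual reader (names here are assumptions):
```
#include <cstddef>
#include <cstdint>

// Simplified ELF64 symbol; field names follow the ELF spec.
struct Elf64Sym {
  uint32_t st_name;
  uint8_t st_info;
  uint8_t st_other;
  uint16_t st_shndx; // section index; may be bogus in stripped binaries
  uint64_t st_value;
  uint64_t st_size;
};

constexpr uint16_t SHN_LORESERVE = 0xff00; // start of reserved indices

// A symbol is "broken" if its section index is neither a valid section
// nor one of the reserved values; such symbols are kept as-is.
bool hasBadSectionIndex(const Elf64Sym &Sym, size_t NumSections) {
  return Sym.st_shndx >= NumSections && Sym.st_shndx < SHN_LORESERVE;
}
```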
Test Plan:
```
ninja check-bolt
```
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D130073
There should not be an end-of-children mark before DW_AT_ranges; removed it and fixed the unit offset.
Reviewed By: ayermolo
Differential Revision: https://reviews.llvm.org/D130335
The latest perf tool can return a non-empty buffer when executing the
buildid-list command, even when perf.data was recorded with the -B flag.
Some binaries will be listed without an ID, while others may have a
recorded ID. Allow invalid entries in the input, while checking the
valid ones for a match.
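As a hedged sketch (not the actual BOLT code), the parsing logic amounts to skipping malformed lines while matching well-formed build-ids:
```
#include <sstream>
#include <string>
#include <vector>

// Sketch: each buildid-list line is expected to be "<build-id> <path>", but
// some entries may lack a valid ID. Collect only well-formed IDs and ignore
// the rest instead of failing on the first bad entry.
std::vector<std::string> parseBuildIDs(const std::string &Buffer) {
  std::vector<std::string> IDs;
  std::istringstream Stream(Buffer);
  std::string Line;
  while (std::getline(Stream, Line)) {
    std::istringstream Fields(Line);
    std::string ID, Path;
    if (!(Fields >> ID >> Path))
      continue; // malformed entry: skip rather than abort
    // A SHA-1 build-id is 40 hex characters.
    if (ID.size() == 40 &&
        ID.find_first_not_of("0123456789abcdef") == std::string::npos)
      IDs.push_back(ID);
  }
  return IDs;
}
```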
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D130223
First of all, `LLVM_TOOLS_INSTALL_DIR` put there breaks our NixOS
builds, because `LLVM_TOOLS_INSTALL_DIR`, defined the same as
`CMAKE_INSTALL_BINDIR`, becomes an *absolute* path, and then when
downstream projects try to install there too, this breaks because our
builds always install to fresh directories for isolation's sake.
Second of all, note that `LLVM_TOOLS_INSTALL_DIR` stands out against the
other specially crafted `LLVM_CONFIG_*` variables substituted in
`llvm/cmake/modules/LLVMConfig.cmake.in`.
@beanz added it in d0e1c2a550 to fix a
dangling reference in `AddLLVM`, but I am suspicious of how this
variable doesn't follow the pattern.
Those other ones are carefully made to be build-time vs. install-time
variables depending on which `LLVMConfig.cmake` is being generated, are
carefully made relative as appropriate, and so on. For my NixOS use-case
they are also fine because they are never used as downstream install
variables, only for reading, not writing.
To avoid the problems I face, and to restore symmetry, I deleted the
exported variable and arranged to have a separate
`${project}_TOOLS_INSTALL_DIR` for each project.
`AddLLVM` now instead expects each project to define its own, and they
do so based on `CMAKE_INSTALL_BINDIR`. `LLVMConfig` still exports
`LLVM_TOOLS_BINARY_DIR` which is the location for the tools defined in
the usual way, matching the other remaining exported variables.
For the `AddLLVM` changes, I tried to copy the existing pattern of
internal vs. non-internal (or LLVM vs. downstream) function/macro
names, but it would be good to confirm I did that correctly.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D117977
Some unit tests have incorrect DW_AT_type offsets since they were manually crafted; fix them to use the correct offsets.
Reviewed By: Amir, ayermolo
Differential Revision: https://reviews.llvm.org/D129828
This patch adds a dedicated class to keep track of each function's
layout. It also lays the groundwork for splitting functions into
multiple fragments (as opposed to a strict hot/cold split).
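A minimal sketch of the idea, with made-up types rather than BOLT's actual FunctionLayout API:
```
#include <cassert>
#include <cstddef>
#include <vector>

struct BasicBlock; // stand-in for BOLT's BinaryBasicBlock

// The layout owns the ordered block list and records where each fragment
// begins, so a function can be split into more than a hot/cold pair.
class FunctionLayout {
  std::vector<BasicBlock *> Blocks;   // final emission order
  std::vector<size_t> FragmentStarts; // index of the first block per fragment

public:
  void addFragment() { FragmentStarts.push_back(Blocks.size()); }
  void addBlock(BasicBlock *BB) {
    assert(!FragmentStarts.empty() && "create a fragment first");
    Blocks.push_back(BB);
  }
  size_t fragmentCount() const { return FragmentStarts.size(); }
};
```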
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D129518
Add a test for the `-icp-inline` knob, which ensures that ICP is only performed
for functions that can subsequently be inlined.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D129803
We previously added support for split jump tables, where some jump table
entries target different fragments of the same function. In this fix, we
provide support for another type of inter-fragment transfer: landing pads.
When C++ exception handling is used, the compiler emits .gcc_except_table,
which describes the location of the catch block (landing pad) for a specific
range that potentially invokes a throw(). Normally landing pads reside
in the same function, but with -fsplit-machine-functions, landing pads can
be moved to another fragment. The intuition is that landing pads are rarely
executed, so the compiler can move them to the .cold section.
This update marks all fragments that have a landing pad in another
fragment as non-simple, and later propagates non-simple to all related
fragments.
This update also includes one manual test case: split-landing-pad.s
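A rough sketch of the propagation step, with assumed types (the real pass works on BinaryFunction fragments):
```
#include <vector>

struct Fragment {
  std::vector<Fragment *> LandingPadFragments; // fragments holding our LPs
  bool IsSimple = true;
};

// If any fragment's landing pads escape to a sibling fragment, mark every
// fragment in the group as non-simple so it will not be rewritten.
void propagateNonSimple(std::vector<Fragment *> &Group) {
  bool CrossFragmentLP = false;
  for (Fragment *F : Group)
    for (Fragment *LP : F->LandingPadFragments)
      if (LP != F)
        CrossFragmentLP = true;
  if (CrossFragmentLP)
    for (Fragment *F : Group)
      F->IsSimple = false;
}
```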
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D128561
As we are moving towards support for multiple fragments, loops that
iterate over all basic blocks of a function, but do not depend on the
order of basic blocks in the final layout, should iterate over the binary
function directly, rather than over the layout.
Eventually, all loops using the layout list should either iterate over
the function or be aware of multiple layouts. This patch replaces
references to a binary function's block layout with the binary function
itself where only small code changes are necessary.
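Schematically, with stand-in types (the real code uses BinaryFunction and its layout):
```
#include <list>
#include <vector>

struct Block { int ID; };

// Minimal stand-in for a binary function: it owns the blocks, and separately
// keeps a layout order (a permutation of pointers into them).
struct Function {
  std::list<Block> Blocks;     // layout-agnostic ownership order
  std::vector<Block *> Layout; // final emission order

  auto begin() { return Blocks.begin(); } // iterate the function directly
  auto end() { return Blocks.end(); }
};

int countBlocks(Function &F) {
  int N = 0;
  // Order-independent loops should use the function, not F.Layout, so they
  // keep working once a function has multiple layout fragments.
  for (Block &B : F) {
    (void)B;
    ++N;
  }
  return N;
}
```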
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D129585
There are two assumptions regarding jump tables:
(a) A jump table is accessed by only one fragment, say, Parent.
(b) All entries target instructions in Parent.
For (a), BOLT stores jump table entries as offsets relative to Parent.
For (b), BOLT treats jump table entries that target somewhere outside Parent
as INVALID_OFFSET, including targets in fragments of the same split function.
In this update, we extend (a) and (b) to include fragments of the same split
function. For (a), we store jump table entries as absolute offsets
instead. In addition, a jump table now stores all fragments that access
it. A fragment uses this information to create labels only for jump table
entries that target that fragment.
For (b), using absolute offsets allows jump table entries to target
fragments of the same split function, i.e., extends support for split jump
tables. This can be done using relocations (fragment start/size) and
fragment detection heuristics (e.g., using symbol name patterns for
non-stripped binaries).
For jump table targets that can only be reached by one fragment, we
mark them as local labels; otherwise, they become secondary entry points
of the target fragment.
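A sketch of the reworked bookkeeping (types here are illustrative, not BOLT's actual JumpTable class):
```
#include <cstdint>
#include <set>
#include <vector>

struct Fragment; // stand-in for a function fragment

// Entries are stored as absolute addresses rather than offsets from a single
// parent, and the table remembers every fragment that accesses it.
struct JumpTable {
  std::vector<uint64_t> Entries; // absolute target addresses
  std::set<Fragment *> Parents;  // all fragments accessing this table
};

// A fragment creates labels only for entries that land inside itself.
bool targetsFragment(uint64_t Address, uint64_t FragStart, uint64_t FragSize) {
  return Address >= FragStart && Address < FragStart + FragSize;
}
```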
Test Plan:
```
ninja check-bolt
```
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D128474
The gold linker veneers are written between functions without symbols,
so we need to handle them specially in BOLT.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D129260
Since we now have the +all feature for the AArch64 disassembler, we can use it
in BOLT and allow it to disassemble all AArch64 instructions supported by LLVM.
Reviewed by: rafauler
Differential Revision: https://reviews.llvm.org/D129139
Add the -experimental-shrink-wrapping flag to control whether we
move callee-saved registers even when addresses of the stack
frame are captured and used in pointer arithmetic, which makes it more
challenging for alias analysis to prove that we do not access
optimized stack positions. This alias analysis is not yet implemented;
hence, the feature is experimental. In practice, though, no compiler would
emit code that does pointer arithmetic to access a saved callee-saved
register unless there is a memory bug or we are failing to identify a
callee-saved reg, so I'm not sure how useful it would be to formally
prove that.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D126115
Change shrink-wrapping to try a priority list of save
positions, instead of trying only the best one and giving up if it doesn't
work. This also increases coverage.
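In pseudocode terms, the change amounts to something like the following sketch (names and types are assumptions):
```
#include <functional>
#include <optional>
#include <vector>

struct SavePos { int Depth; }; // hypothetical candidate save placement

// Instead of committing to the single best candidate and giving up on
// failure, walk a priority-ordered list and take the first position that
// the validity check accepts.
std::optional<SavePos>
pickSavePosition(const std::vector<SavePos> &Ranked,
                 const std::function<bool(const SavePos &)> &IsValid) {
  for (const SavePos &Candidate : Ranked)
    if (IsValid(Candidate))
      return Candidate;
  return std::nullopt; // no legal position: leave the save where it is
}
```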
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D126114
Add the option to run -equalize-bb-counts before shrink
wrapping to avoid unnecessarily optimizing some CFGs where the profile is
inaccurate but we can prove two blocks have the same frequency.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D126113
Refactor isStackAccess() to reflect the updates made in D126116. Now we only
handle simple stack accesses and delegate the rest of the cases to
getMemDataSize().
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D126112
Change how the function score is calculated and provide more
detailed statistics when reporting frame optimizer and shrink
wrapping results. The new statistics provide dynamic coverage
numbers. The main metric for shrink wrapping is the number of executed
stores that were saved because of shrink wrapping (push instructions
that were either entirely moved away from the hot block or converted
to a stack adjustment instruction). There is still a number of reduced
load instructions (pops) that we are not counting at the moment. Also
update the alloc combiner and the frame optimizer to report dynamic
numbers.
For debugging purposes, we also include a list of the top 10 functions
optimized by shrink wrapping. These changes are aimed at better
understanding the impact of shrink wrapping on a given binary.
We also remove an assertion in dataflow analysis so that it does not choke
on empty functions (asserting on those makes no sense).
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D126111
There is a post-processing step in ext-tsp block reordering that merges some
blocks into chains. This makes it possible to maintain the original block order
in the absence of profile data and can be beneficial for code size (when
fallthroughs are merged).
In the earlier version we could merge hot and cold (that is, with zero
execution count) chains, which were later split by SplitFunctions.cpp (when
split-all-cold=1). The diff eliminates the redundant merging.
It is unlikely the change will affect the performance of a binary in a
measurable way, as it mostly operates on cold basic blocks. However, after
the diff the impact of split-all-cold is almost negligible and we can avoid the
extra function splitting.
Measuring on the clang binary (negative is good, positive is a regression):
**clang12**
benchmark1: `0.0253`
benchmark2: `-0.1843`
benchmark3: `0.3234`
benchmark4: `0.0333`
**clang10**
benchmark1: `-0.2517`
benchmark2: `-0.3703`
benchmark3: `-0.1186`
benchmark4: `-0.3822`
**clang7**
benchmark1: `0.2526`
benchmark2: `0.0500`
benchmark3: `0.3024`
benchmark4: `-0.0489`
**Overall**: `-0.0671 ± 0.1172` (insignificant)
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D129397
Summary:
Introduce the NeverAlign fragment type.
The intended usage of this fragment is to insert it before a pair of
macro-op fusion eligible instructions. The NeverAlign fragment ensures that
the next fragment (the first instruction in the pair) does not end at a
given alignment boundary, by emitting a minimal-size nop if necessary.
In effect, it ensures that a pair of macro-fusible instructions is not
split by a given alignment boundary, which is a precondition for
macro-op fusion in modern Intel Cores (64B = cache line size; see Intel
Architecture Optimization Reference Manual, 2.3.2.1 Legacy Decode
Pipeline: Macro-Fusion).
This patch introduces functionality used by BOLT when emitting code with
MacroFusion alignment already in place.
The use case is different from BoundaryAlign and instruction bundling:
- BoundaryAlign can be extended to perform the desired alignment for the
first instruction in the macro-op fusion pair (D101817). However, this
approach has higher overhead due to the reliance on relaxation that
BoundaryAlign requires in the general case - see
https://reviews.llvm.org/D97982#2710638.
- Instruction bundling: the intent of the NeverAlign fragment is to prevent
the first instruction in a pair from ending at a given alignment boundary, by
inserting at most one minimum-size nop. It's OK if either instruction
crosses the cache line. Padding both instructions using bundles so that they
do not cross the alignment boundary would result in excessive padding. There's
no straightforward way to request instruction bundling to avoid a given
end alignment for the first instruction in the bundle.
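A toy model of the NeverAlign rule (assumed names; the real MC fragment machinery is more involved):
```
#include <cstdint>

// If the next fragment (the first instruction of the macro-fusible pair)
// would end exactly on the alignment boundary, emit one minimal-size
// (1-byte on x86) nop in front of it to shift it off the boundary.
uint64_t neverAlignPadding(uint64_t Offset, uint64_t NextInstSize,
                           uint64_t Boundary) { // e.g. 64-byte cache line
  return ((Offset + NextInstSize) % Boundary == 0) ? 1 : 0;
}
```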
LLVM: https://reviews.llvm.org/D97982
Manual rebase conflict history:
https://phabricator.intern.facebook.com/D30142613
Test Plan: sandcastle
Reviewers: #llvm-bolt
Subscribers: phabricatorlinter
Differential Revision: https://phabricator.intern.facebook.com/D31361547
`__bolt_instr_data_dump()` does not lock the hash tables while iterating
over them, so the iteration can happen concurrently with a modification
done in another thread, while the table is in an inconsistent state. This
has also been observed in practice, where it caused a segmentation fault.
We fix this by locking the hash tables during iteration. This is done by
taking the lock in `forEachElement()`.
The only other site of iteration, `resetCounters()`, has been correctly
locking the table even before this patch. This patch removes its `Lock`
because the lock is now taken in the inner `forEachElement()`.
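A minimal sketch of the shape of the fix, assuming standard library primitives (the actual runtime has its own table and lock types):
```
#include <mutex>

// forEachElement() takes the table lock itself, so every iteration site,
// including the data dump, observes a consistent table, and callers no
// longer need to lock around it.
template <typename MapT, typename CallbackT>
void forEachElement(MapT &Table, std::mutex &Lock, CallbackT Callback) {
  std::lock_guard<std::mutex> Guard(Lock); // serialize with concurrent updates
  for (auto &Entry : Table)
    Callback(Entry);
}
```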
Reviewed By: maksfb, yota9
Differential Revision: https://reviews.llvm.org/D129089
The compiler can generate calls to some functions implicitly, even under
the constraints of a freestanding environment. Make sure these functions are
available in our runtime objects.
Fixes test failures on some systems after https://reviews.llvm.org/D128960.
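For example, GCC and Clang may synthesize calls to memcpy, memmove, memset, and memcmp even in a freestanding build, so a runtime linked without libc has to provide them; a naive sketch:
```
#include <cstddef>

// Illustrative byte-loop definition; the real runtime implementation may
// differ. Built with -ffreestanding/-fno-builtin so the loop is not itself
// turned back into a memcpy call.
extern "C" void *memcpy(void *Dest, const void *Src, std::size_t N) {
  auto *D = static_cast<unsigned char *>(Dest);
  const auto *S = static_cast<const unsigned char *>(Src);
  for (std::size_t I = 0; I < N; ++I)
    D[I] = S[I];
  return Dest;
}
```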
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D129168
Changed the acquire implementation to __atomic_test_and_set() and release
to __atomic_clear(), which eliminates inline asm usage and is architecture
independent.
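For reference, a sketch of an arch-independent lock built on these builtins (the actual runtime code differs in details):
```
struct SpinLock {
  bool Flag = false;

  void acquire() {
    // __atomic_test_and_set atomically sets the flag and returns its
    // previous value; keep spinning until we observe it was clear.
    while (__atomic_test_and_set(&Flag, __ATOMIC_ACQUIRE))
      ;
  }

  void release() { __atomic_clear(&Flag, __ATOMIC_RELEASE); }
};
```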
Elvina Yakubova,
Advanced Software Technology Lab, Huawei
Reviewers: yota9, maksfb, rafauler
Differential Revision: https://reviews.llvm.org/D129162
If BOLT instrumentation runtime uses XMM registers, it can interfere
with the user program causing crashes and unexpected behavior. This
happens as the instrumentation code preserves general purpose registers
only.
Build BOLT instrumentation runtime with "-mno-sse".
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D128960
Added support for mixing monolithic DWARF5 with legacy DWARF, and monolithic legacy DWARF with DWARF5 split DWARF.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D128232
Generate the INSTRINFO_OPERAND_TYPE table in X86GenInstrInfo.inc.
This diff adds support for instructions that were previously reported as having
memory access size 0. It replaces the heuristic of looking at the instruction's
register width to determine memory access width, instead checking the memory
operand type using tablegen-provided tables.
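Conceptually (the enum below is an illustration; the real values come from the tablegen-generated OPERAND_TYPE table in X86GenInstrInfo.inc):
```
#include <cstdint>

enum class MemOperandType { Mem8, Mem16, Mem32, Mem64, Mem128, Unknown };

// The access width is derived from the memory operand's type instead of
// the register width heuristic.
uint8_t memAccessSize(MemOperandType Ty) {
  switch (Ty) {
  case MemOperandType::Mem8:   return 1;
  case MemOperandType::Mem16:  return 2;
  case MemOperandType::Mem32:  return 4;
  case MemOperandType::Mem64:  return 8;
  case MemOperandType::Mem128: return 16;
  default:                     return 0; // unknown: caller must bail out
  }
}
```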
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D126116
Don't dump dot CFG graph for functions that should not be printed.
Reviewed By: rafauler, maksfb
Differential Revision: https://reviews.llvm.org/D128699
When the SplitFunctions pass adds trampoline code for exception landing
pads (limited to shared objects), it may increase the size of the hot
fragment, making it larger than the whole function pre-split. When this
happens, the pass reverts the splitting action by restoring the original
block order and marking all blocks hot.
However, if createEHTrampolines() added new blocks to the CFG and
modified invoke instructions, simply restoring the original block layout
will not suffice, as the new CFG has more blocks.
For a proper backout of the split, modify the original layout by merging
in trampoline blocks immediately before their matching targets. As a
result, the number of blocks increases, but the number of instructions
and the function size remain the same as pre-split.
Add an assertion for the number of blocks when updating a function
layout.
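A sketch of the merge step with assumed types (the real code operates on BinaryBasicBlocks within a BinaryFunction):
```
#include <map>
#include <vector>

struct Block; // stand-in basic block

// Rebuild the pre-split order, splicing each EH trampoline in right before
// the landing pad it forwards to, so the restored layout accounts for the
// extra blocks.
std::vector<Block *>
mergeTrampolines(const std::vector<Block *> &OriginalOrder,
                 const std::map<Block *, Block *> &TrampolineToTarget) {
  // Invert the mapping: target block -> its trampolines.
  std::map<Block *, std::vector<Block *>> Before;
  for (const auto &[Tramp, Target] : TrampolineToTarget)
    Before[Target].push_back(Tramp);

  std::vector<Block *> Merged;
  for (Block *BB : OriginalOrder) {
    for (Block *Tramp : Before[BB]) // trampolines go just before the target
      Merged.push_back(Tramp);
    Merged.push_back(BB);
  }
  return Merged;
}
```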
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D128696
For test purposes, we want to split functions at a random split point
to be able to test different layouts without relying on the profile.
This patch introduces an option that randomly chooses a split point
to partition the blocks of a function into hot and cold regions.
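A possible sketch of the selection (names are hypothetical; a fixed seed keeps tests reproducible):
```
#include <cstddef>
#include <random>
#include <vector>

struct Block; // stand-in basic block

// Pick an index in [1, N]: blocks [0, SplitPoint) stay hot, the rest go cold.
std::size_t chooseRandomSplitPoint(const std::vector<Block *> &Layout,
                                   unsigned Seed) {
  if (Layout.empty())
    return 0; // nothing to split
  std::mt19937 Gen(Seed);
  std::uniform_int_distribution<std::size_t> Dist(1, Layout.size());
  return Dist(Gen);
}
```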
Reviewed By: Amir, yota9
Differential Revision: https://reviews.llvm.org/D128773
This reverts commit 425dda76e9.
This commit is currently causing BOLT to crash in one of our
binaries and needs a bit more checking to make sure it is safe
to land.
The gold linker veneers are written between functions without symbols,
so we need to handle them specially in BOLT.
Vladislav Khmelevsky,
Advanced Software Technology Lab, Huawei
Differential Revision: https://reviews.llvm.org/D128082
ICP peeling for inline mode only makes sense for calls, not jump tables.
Plus, add a check that the Target BinaryFunction is found.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D128404
The SplitFunctions pass no longer distinguishes between various splitting
modes. This change updates the command-line interface to reflect this
behavior by deprecating the values passed to the --split-functions option.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D128558
DWARF 5 added two new attributes: DW_AT_call_pc and DW_AT_call_return_pc.
Add support for them.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D128526
Add functionality to allow splitting code with C++ exceptions in shared
libraries and PIEs. To overcome a limitation in the exception ranges format,
for functions with fragments spanning multiple sections, add trampoline
landing pads in the same section as the corresponding throwing range.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D127936
Allow a cold fragment to get a new address.
Our previous assumption was that a fragment (.cold) is only reached
through the main fragment of the same function. In addition, a .cold
fragment must be reached through either (a) a direct transfer, or (b) a
split jump table. For (a), we perform a simple fix-up. For (b), we
currently mark all relevant fragments as non-simple. Therefore, there was
no need to get a new address for the .cold fragment.
This is not always the case, as a function entry can be rarely executed
and placed in the .text.cold segment. Essentially, we cannot tell which
is the source-level function entry based on hot and cold segments,
so we must treat each fragment as a function on its own. Therefore, we
remove the assertion that a function entry cannot be a cold fragment.
Test Plan:
```
ninja check-bolt
```
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D128111
Resolve a crash related to split functions.
Due to the split function optimization, a function can be divided into two
fragments, and both fragments can access the same jump table. This
violates the assumption that a jump table can only have one parent
function, which causes a crash during instrumentation.
We want to support the case where different functions cannot access the
same jump tables, but different fragments of the same function can.
As all fragments are from the same function, we point JT::Parent to one
specific fragment. Right now it is the first disassembled fragment, but
we can point it to the function's main fragment later.
Functions are disassembled sequentially. Previously, at the end of
processing a function, JT::OffsetEntries was cleared, so other fragments
could no longer reuse it. To extend the support for split
functions, we only clear JT::OffsetEntries after all functions are
disassembled.
Say A.hot and A.cold access a JT with three targets {X, Y, Z}, where
X and Y are in A.hot, and Z is in A.cold. Suppose that A.hot is
disassembled first; then JT::OffsetEntries = {X',Y',INVALID_OFFSET}. When
A.cold is disassembled, it cannot reuse the JT::OffsetEntries above due to
the different fragment start. A simple solution:
A.hot = {X',Y',INVALID_OFFSET}
A.cold = {INVALID_OFFSET, INVALID_OFFSET, INVALID_OFFSET}
We update the assertion to allow different fragments of the same function
to get the same JumpTable object.
Potential improvements:
A.hot = {X',Y',INVALID_OFFSET}
A.cold = {INVALID_OFFSET, INVALID_OFFSET, Z'}
The main issue is that A.hot and A.cold have separate CFGs, so jump table
targets are still constrained within fragment bounds.
Future improvements:
A.hot = {X, Y, Z}
A.cold = {X, Y, Z}
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D127924