llvm-project

Commit Graph

Author	SHA1	Message	Date
Amir Ayupov	c844850bdf	[BOLT][NFC] Move out handleIndirectBranch Move the large lambda out of BinaryFunction::disassemble, reducing its size from 295 to 255 LoC. Differential Revision: https://reviews.llvm.org/D132101	2022-08-23 17:36:51 -07:00
Amir Ayupov	ec1fbf229e	[BOLT][NFC] Move out handleExternalReference Move the large lambda out of BinaryFunction::disassemble, reducing its size from 338 to 295 LoC. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D132100	2022-08-23 17:36:41 -07:00
Amir Ayupov	6cd475f8ca	[BOLT][NFC] Move out handlePCRelOperand Move the large lambda out of BinaryFunction::disassemble, reducing its size from 377 to 338 LoC. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D132099	2022-08-23 17:36:29 -07:00
Kazu Hirata	258531b7ac	Remove redundant initialization of Optional (NFC)	2022-08-20 21:18:28 -07:00
John Ericson	90dcdc4b6e	[bolt][llvm][cmake] Use `CMAKE_INSTALL_LIBDIR` too Working back towards D130586. Bolt didn't use `LLVM_LIBDIR_SUFFIX` before, and has no in-tree reverse dependencies, it seems easier to add. The change in LLVM itself is to prevent some unexpected `lib64` from cropping up due to the `CMAKE_INSTALL_LIBDIR` defaulting logic. Differential Revision: https://reviews.llvm.org/D132297	2022-08-20 13:08:06 -04:00
Alexander Yermolovich	928c2ba179	[DWARF][BOLT] Fix handling of converting range access from offset to index. We weren't correctly handling the creation of DW_AT_rnglists_base in UnitDie when converting the access pattern for DW_AT_ranges from offset to index for DWARF5. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D132087	2022-08-19 15:28:12 -07:00
Fabian Parzefall	e001a4e489	[BOLT] Insert EH trampolines for multiple fragments This patch adds exception handling trampolines when a function is split into more than two fragments. Trampolines are tracked per-fragment, such that they can be removed if splitting is reversed. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D132048	2022-08-18 21:55:08 -07:00
Fabian Parzefall	ac830664b2	[BOLT] Update buildCallGraph to check for split blocks Use isSplit() instead of isCold() when building the call graph and update parameter names to reflect this. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D132047	2022-08-18 21:55:08 -07:00
Fabian Parzefall	48ff38ce5d	[BOLT] Add randomN split strategy This adds a strategy to split functions into a random number of fragments at randomly chosen split points. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D130647	2022-08-18 21:55:07 -07:00
Fabian Parzefall	f428db7a00	[BOLT] Add split all blocks strategy This adds a function splitting strategy that splits each outlineable basic block into its own fragment. This is exposed through a new command line option `--split-strategy`. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D129827	2022-08-18 21:55:07 -07:00
Fabian Parzefall	0f74d191d1	[BOLT] Generate sections for multiple fragments This patch adds support to generate any number of sections that are assigned to fragments of functions that are split more than two-way. With this, a function's nth split fragment goes into section `.text.cold.n`. This also changes `FunctionLayout::erase` to make sure, that there are no empty fragments at the end of the function. This sometimes happens when blocks are erased from the function. To avoid creating symbols pointing to these fragments, they need to be removed. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D130521	2022-08-18 21:55:06 -07:00
Fabian Parzefall	a191ea7d59	[BOLT] Make exception handling fragment aware This adds basic fragment awareness in the exception handling passes and generates the necessary symbols for fragments. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D130520	2022-08-18 21:55:06 -07:00
Fabian Parzefall	275e075cbe	[BOLT] Support passing fragments to code emission This changes code emission such that it can emit specific function fragments instead of scanning all basic blocks of a function and just emitting those that are hot or cold. To implement this, `FunctionLayout` explicitly distinguishes the "main" fragment (i.e. the one that contains the entry block and is associated with the original symbol) from "split" fragments. Additionally, `BinaryFunction` receives support for multiple cold symbols - one for each split fragment. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D130052	2022-08-18 21:55:06 -07:00
John Ericson	e941b031d3	Revert "[cmake] Use `CMAKE_INSTALL_LIBDIR` too" This reverts commit `f7a33090a9`. Unfortunately this causes a number of failures that didn't show up in my local build.	2022-08-18 22:46:32 -04:00
Amir Ayupov	129dfc8a9a	Revert "[BOLT][NFC] Simplify addRelocation" This reverts commit `29f2301322`. This change breaks one of the internal tests.	2022-08-18 17:26:26 -07:00
John Ericson	f7a33090a9	[cmake] Use `CMAKE_INSTALL_LIBDIR` too We held off on this before as `LLVM_LIBDIR_SUFFIX` conflicted with it. Now we return this. `LLVM_LIBDIR_SUFFIX` is kept as a deprecated way to set `CMAKE_INSTALL_LIBDIR`. The other `*_LIBDIR_SUFFIX` are just removed entirely. I imagine this is too potentially-breaking to make LLVM 15. That's fine. I have a more minimal version of this in the distro (NixOS) patches for LLVM 15 (like previous versions). This more expansive version I will test harder after the release is cut. Reviewed By: sebastian-ne, ldionne, #libc, #libc_abi Differential Revision: https://reviews.llvm.org/D130586	2022-08-18 15:33:35 -04:00
Denis Revunov	d0e29e87cd	[BOLT][AArch64] Ignore functions with islandsInfo during VeneerElimination and ICF Differential Revision: https://reviews.llvm.org/D131881 Reviewed By: yota9	2022-08-18 11:08:47 -04:00
Amir Ayupov	e33599371e	[BOLT][NFC] Reformat strings in handleRelocation With reduced indentation, some strings can be reformatted to take less lines. Also strategically apply `formatv` to shorten them. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D132088	2022-08-17 20:45:18 -07:00
Amir Ayupov	70d0134f1d	[BOLT][NFC] Split out handleRelocation Split out the body of a for-loop in `RewriteInstance::readRelocations` into a separate function (`handleRelocation`). It's still over 300 lines of code, so it's worth splitting down further. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D132078	2022-08-17 20:43:51 -07:00
Amir Ayupov	330eec139e	[BOLT][UTILS] Add nfc-check-setup --switch-back option Add an option to switch repo revision back, handling stashing automatically. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D128243	2022-08-17 17:37:27 -07:00
Amir Ayupov	055f9f6d08	[BOLT][NFC] Simplify debug logging in case of JT heuristic failure Move logging into LLVM_DEBUG scope. Remove redundant printing of jump table parents: Old logging: ``` failed to analyze jump table in function _ZN12_GLOBAL__N_116InitHeaderSearch23AddDefaultCIncludePathsERKN4llvm6TripleERKN5clang19HeaderSearchOptionsE/1(2) PIC Jump table JUMP_TABLE/_ZN12_GLOBAL__N_116InitHeaderSearch23AddDefaultCIncludePathsERKN4llvm6TripleERKN5clang19HeaderSearchOptionsE/1.1 for function _ZN12_GLOBAL__N_116InitHeaderSearch23AddDefaultCIncludePathsERKN4llvm6TripleERKN5clang19HeaderSearchOptionsE/1(2) at 0x65996e0 with a total count of 0: 0x9dc next jump table at 0x659a810 belongs to function _ZN5clang5Lexer40LexDependencyDirectiveTokenWhileSkippingERNS_5TokenE PIC Jump table JUMP_TABLE/_ZN5clang5Lexer40LexDependencyDirectiveTokenWhileSkippingERNS_5TokenE.0 for function _ZN5clang5Lexer40LexDependencyDirectiveTokenWhileSkippingERNS_5TokenE at 0x659a810 with a total count of 0: jump table heuristic failure ``` New logging: ``` failed to analyze PIC Jump table JUMP_TABLE/_ZN12_GLOBAL__N_116InitHeaderSearch23AddDefaultCIncludePathsERKN4llvm6TripleERKN5clang19HeaderSearchOptionsE/1.1 for function _ZN12_GLOBAL__N_116InitHeaderSearch23AddDefaultCIncludePathsERKN4llvm6TripleERKN5clang19HeaderSearchOptionsE/1(*2) at 0x65996e0 with a total count of 0: absolute offset: 0x52ac58c next PIC Jump table JUMP_TABLE/_ZN5clang5Lexer40LexDependencyDirectiveTokenWhileSkippingERNS_5TokenE.0 for function _ZN5clang5Lexer40LexDependencyDirectiveTokenWhileSkippingERNS_5TokenE at 0x659a810 with a total count of 0: jump table heuristic failure ``` Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D131243	2022-08-17 17:35:16 -07:00
Amir Ayupov	cdef841fe7	[BOLT][NFC] Simplify scanExternalRefs Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D132013	2022-08-17 17:33:59 -07:00
Alexander Yermolovich	ccbf28b09d	[BOLT][DWARF] Handle zero size DW_TAG_inlined_subroutine We were resetting DW_AT_low_pc to zero when DW_AT_high_pc was zero, or DW_AT_low_pc == DW_AT_high_pc. This caused LLDB to print the error "adding range [0x0-0x0) which has a base that is less than the function's low PC". Changed it so that when this case arises we set DW_AT_low_pc to the start address. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D132059	2022-08-17 17:29:53 -07:00
Fabian Parzefall	fd159c2316	[BOLT] Fix ignored LP at fragment start If the first block of a fragment is also a landing pad, the landing pad is not used if an exception is thrown. This is because the landing pad is at the same start address that the corresponding LSDA describes. In that case, the offset in the call site records to refer to that landing pad is zero, and a zero offset is interpreted by the personality function as "no handler" and ignored. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D132053	2022-08-17 16:34:44 -07:00
Amir Ayupov	4ddc9c8e12	[BOLT][NFC] Move printRelocationInfo into a method Move this large lambda out of readRelocations into a standalone method. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D131812	2022-08-17 16:28:33 -07:00
Amir Ayupov	29f2301322	[BOLT][NFC] Simplify addRelocation Move the implementation out of the header file. Simplify the method. Add debug logging. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D131811	2022-08-17 16:07:28 -07:00
Alexander Yermolovich	b786e01f93	[DWARF][BOLT] Handle getBinaryFunctionContainingAddress returning nullptr for DW_TAG_call_site DW_TAG_call_site/DW_AT_call_return_pc can contain address that is not in any function. In this case getBinaryFunctionContainingAddress returns nullptr. For this case preserving original address. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D132057	2022-08-17 16:04:34 -07:00
Fabian Parzefall	aed75748de	[BOLT] Remove old layout from function layout To track whether a function's new layout is different from its old layout when updating it, the old layout would be kept around in memory indefinitely (if the new layout is different). This was used only for debugging/logging purposes. This patch forces the caller of function layout's update method to copy the old layout into a temporary if they need it by removing the old layout fields. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D131413	2022-08-17 15:06:17 -07:00
Fabian Parzefall	0f8412c19c	[BOLT] Add main fragment to function layout Functions that do not contain any code still have to be emitted. This occurs on AArch64 where functions can consist only of a constant island. To support fragment semantics in code emission, this commit adds a guaranteed main fragment to function layout. This fragment might be empty, but allows us to omit checks whether the function is empty in most places. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D130051	2022-08-17 14:51:31 -07:00
Amir Ayupov	556efdba85	[BOLT][NFC] Extend debug logging in analyzeJumpTable Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D131918	2022-08-15 20:34:40 -07:00
Kazu Hirata	2febc32c9c	Use llvm::erase_if (NFC)	2022-08-13 12:55:48 -07:00
Fangrui Song	53113515cd	[BOLT] Use Optional::emplace to avoid move assignment. NFC	2022-08-12 12:51:50 -07:00
Fangrui Song	0972a390b9	LLVM_FALLTHROUGH => [[fallthrough]]. NFC	2022-08-09 04:06:52 +00:00
Kazu Hirata	d3651aa697	[BOLT] Upgrade to C++17 Without this patch, I am getting errors like: llvm-project/llvm/include/llvm/ADT/StringRef.h:233:7: error: use of the 'nodiscard' attribute is a C++17 extension [-Werror,-Wc++17-extensions] Differential Revision: https://reviews.llvm.org/D131348	2022-08-07 23:12:16 -07:00
Thorsten Schütt	0c9258612b	[bolt] silence unused variables warnings	2022-08-06 20:52:45 +02:00
Kazu Hirata	c8e6ebd74e	Use value instead of getValue (NFC)	2022-08-06 11:21:39 -07:00
Tobias Hieta	b1356504e6	[LLVM] Update C++ standard to 17 Also make the soft toolchain requirements hard. This allows us to use C++17 features in LLVM now. If we find patterns with C++17 that improve readability it should be recommended in the coding standards. Reviewed By: jhenderson, cor3ntin, MaskRay Differential Revision: https://reviews.llvm.org/D130689	2022-08-06 09:42:10 +02:00
Rafael Auler	19eb908e61	[BOLT] Remove always true if statement Got a warning from GCC when building this. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D131092	2022-08-03 13:11:33 -07:00
Nicolai Hähnle	f7872cdce1	CommandLine: add and use cl::SubCommand::get{All,TopLevel} Prefer using these accessors to access the special sub-commands corresponding to the top-level (no subcommand) and all sub-commands. This is a preparatory step towards removing the use of ManagedStatic: with a subsequent change, these global instances will be moved to be regular function-scope statics. It is split up to give downstream projects a (albeit short) window in which they can switch to using the accessors in a forward-compatible way. Differential Revision: https://reviews.llvm.org/D129118	2022-08-02 23:49:16 +02:00
Sriraman Tallam	3e43d0cde7	This patch fixes these errors while building BOLT. Compiling llvm/llvm-project/bolt/include/bolt/Passes/RegReAssign.h failed: ...: error: invalid application of 'sizeof' to an incomplete type 'llvm::bolt::BinaryFunctionCallGraph' static_assert(sizeof(_Tp) >= 0, "cannot delete an incomplete type"); error: type 'llvm::bolt::BinaryBasicBlock *' cannot be used prior to '::' because it has no members using NodeRef = typename GraphType::UnknownGraphTypeError; BinaryDomTree.h:31:14: error: no template named 'DomTreeGraphTraitsBase' : public DomTreeGraphTraitsBase<bolt::BinaryDomTreeNode, Differential Revision: https://reviews.llvm.org/D130402	2022-08-02 11:23:37 -07:00
David Blaikie	7651522b78	Fold assert-used variable into assert Fixes #56724	2022-08-01 21:57:11 +00:00
Alexander Yermolovich	dd29b3c542	[BOLT][DWARF] Fix handling of multiple DW_OP_addrx in an expression We were not correctly handling multiple DW_OP_addrx in the location expression. This was exposed by clang-15 build in release mode with debug information. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D130812	2022-08-01 14:38:47 -07:00
Kazu Hirata	bf6021709a	Use drop_begin (NFC)	2022-07-31 15:17:09 -07:00
Kazu Hirata	ce3b687b88	[BOLT] Remove redundant string initialization (NFC) Identified with readability-redundant-string-init.	2022-07-31 15:17:05 -07:00
Kazu Hirata	f24ddf6d41	[BOLT] Remove redundant const from return types (NFC) Identified with readability-const-return-type.	2022-07-31 15:17:03 -07:00
Kazu Hirata	1bf531a5d0	[BOLT] Use boolean literals (NFC) Identified with modernize-use-bool-literals.	2022-07-31 15:17:02 -07:00
Amir Ayupov	468d4f6d18	Revert "[BOLT] Ignore functions accessing false positive jump tables" This diff uncovers an ASAN leak in getOrCreateJumpTable: ``` Indirect leak of 264 byte(s) in 1 object(s) allocated from: #1 0x4f6e48c in llvm::bolt::BinaryContext::getOrCreateJumpTable ... ``` The removal of an assertion needs to be accompanied by proper deallocation of a `JumpTable` object for which `analyzeJumpTable` was unsuccessful. This reverts commit `52cd00cabf`.	2022-07-30 10:39:46 -07:00
Kazu Hirata	12b29900a1	Use any_of (NFC)	2022-07-30 10:35:56 -07:00
Kazu Hirata	f081ec20b5	[bolt] Remove redundant virtual specifiers (NFC) Identified with modernize-use-override.	2022-07-30 10:35:51 -07:00
Kazu Hirata	b498a8991e	[bolt] Remove redundant control-flow statements (NFC) Identified with readability-redundant-control-flow.	2022-07-30 10:35:49 -07:00
Kazu Hirata	60db8d9b4e	Use nullptr instead of 0 (NFC) Identified with modernize-use-nullptr.	2022-07-30 10:35:48 -07:00
Rafael Auler	fc0ced73dc	Add BAT testing framework This patch refactors BAT to be testable as a library, so we can have open-source tests on it. This further fixes an issue with basic blocks that lack a valid input offset, making BAT omit those when writing translation tables. Test Plan: new testcases added, new testing tool added (llvm-bat-dump) Differential Revision: https://reviews.llvm.org/D129382	2022-07-29 14:55:04 -07:00
Fangrui Song	7430894a65	Replace Optional::hasValue with has_value or operator bool. NFC	2022-07-29 10:57:25 -07:00
Fangrui Song	999514bb9a	[bolt] Replace Optional::getValue with value or operator*. NFC	2022-07-29 01:15:24 -07:00
Huan Nguyen	52cd00cabf	[BOLT] Ignore functions accessing false positive jump tables Disassembly and branch target analysis are not decoupled, so any analysis that depends on disassembly may not operate properly. In specific, analyzeJumpTable uses instruction bounds check property. A jump table was analyzed twice: (a) during disassembly, and (b) after disassembly, so there are potentially some mismatched results. In this update, functions that access JTs which fail the second check will be marked as ignored. Test Plan: ``` ninja check-bolt ``` Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D130431	2022-07-28 23:22:17 -07:00
Huan Nguyen	ccabbfff86	[BOLT] Remove --allow-stripped option AllowStripped has not been used in BOLT. This option is replaced by actively detecting stripped binary. Test Plan: Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D130036	2022-07-28 23:15:53 -07:00
Huan Nguyen	986362d4a3	[BOLT] Add BinaryContext::IsStripped Determine stripped status of a binary based on .symtab Test Plan: ``` ninja check-bolt ``` Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D130034	2022-07-28 23:11:03 -07:00
Simon Tatham	0db13e10c5	[bolt,AArch64] Fix one more test failure from D130358. This one actually makes the test simpler, because lit doesn't have to reconstitute a 32-bit little-endian value from individual bytes any more: llvm-objdump is printing the desired 32-bit value in the first place, so we can move straight on to doing the arithmetic on it.	2022-07-26 16:41:09 +01:00
Amir Ayupov	77c1977384	[BOLT] Support files with no symbols `LastSymbol` handling in `discoverFileObjects` assumes a non-zero number of symbols in an object file. It's not the case for broken_dynsym.test added in D130073, and potentially other stripped binaries. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D130544	2022-07-26 00:07:59 -07:00
Amir Ayupov	79c2fe066d	[BOLT][TEST] Update fptr.test The test exercises an implicit ptr-to-int conversion which is made an error in D129881. We acknowledge the error but still want to test this case. Add `-Wno-int-conversion` to silence the error. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D130546	2022-07-25 22:00:46 -07:00
Fabian Parzefall	83882606db	[BOLT] Process each block only once in fixCFGForPIC Rather than iterating over the whole function from the start until no internal calls are found, process each block only once and continue processing after splitting. This version of the function also does not seemingly invalidate iterators from within the loop. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D130436	2022-07-25 15:06:24 -07:00
Huan Nguyen	8eb68d92d4	[BOLT] Handle broken .dynsym in stripped binaries Strip tools cause a few symbols in .dynsym to have bad section index. This update safely keeps such broken symbols intact. Test Plan: ``` ninja check-bolt ``` Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D130073	2022-07-22 11:24:09 -07:00
zr33	a2035c566f	[BOLT][DWARF] Fix bolt/test/X86/shared-abbrev.s There should not be a end of child mark before DW_AT_ranges, removed it and fixed unit offset. Reviewed By: ayermolo Differential Revision: https://reviews.llvm.org/D130335	2022-07-22 10:45:28 -07:00
Maksim Panchenko	661577b5f4	[BOLT] Add support for the latest perf tool The latest perf tool can return non-empty buffer when executing buildid-list command, even when perf.data was recorded with -B flag. Some binaries will be listed without the ID, while others may have a recorded ID. Allow invalid entries on the input, while checking the valid ones for the match. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D130223	2022-07-22 07:56:15 -07:00
John Ericson	07b749800c	[cmake] Don't export `LLVM_TOOLS_INSTALL_DIR` anymore First of all, `LLVM_TOOLS_INSTALL_DIR` put there breaks our NixOS builds, because `LLVM_TOOLS_INSTALL_DIR` defined the same as `CMAKE_INSTALL_BINDIR` becomes an absolute path, and then when downstream projects try to install there too this breaks because our builds always install to fresh directories for isolation's sake. Second of all, note that `LLVM_TOOLS_INSTALL_DIR` stands out against the other specially crafted `LLVM_CONFIG_*` variables substituted in `llvm/cmake/modules/LLVMConfig.cmake.in`. @beanz added it in `d0e1c2a550` to fix a dangling reference in `AddLLVM`, but I am suspicious of how this variable doesn't follow the pattern. Those other ones are carefully made to be build-time vs install-time variables depending on which `LLVMConfig.cmake` is being generated, are carefully made relative as appropriate, etc. etc. For my NixOS use-case they are also fine because they are never used as downstream install variables, only for reading not writing. To avoid the problems I face, and restore symmetry, I deleted the exported variable and arranged to have many `${project}_TOOLS_INSTALL_DIR`s. `AddLLVM` now instead expects each project to define its own, and they do so based on `CMAKE_INSTALL_BINDIR`. `LLVMConfig` still exports `LLVM_TOOLS_BINARY_DIR` which is the location for the tools defined in the usual way, matching the other remaining exported variables. For the `AddLLVM` changes, I tried to copy the existing pattern of internal vs non-internal or for LLVM vs for downstream function/macro names, but it would be good to confirm I did that correctly. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D117977	2022-07-21 19:04:00 +00:00
Sriraman Tallam	116ee23f4c	[bolt] std::atomic_uint64_t to std::atomic<uint64_t> Differential Revision: https://reviews.llvm.org/D129903	2022-07-19 16:09:11 -07:00
zr33	1a1324a303	[BOLT][DWARF] Fix incorrect DW_AT_type offset for unittest Some unit tests have an incorrect DW_AT_type offset since they are manually crafted; fix them to use the correct offset. Reviewed By: Amir, ayermolo Differential Revision: https://reviews.llvm.org/D129828	2022-07-18 14:20:22 -07:00
zr33	66a41e0807	[BOLT][DWARF] Add Unit test for DW_AT_high_pc [DW_FORM_addr] Reviewed By: ayermolo Differential Revision: https://reviews.llvm.org/D127613	2022-07-18 14:03:53 -07:00
Fabian Parzefall	8477bc6761	[BOLT] Add function layout class This patch adds a dedicated class to keep track of each function's layout. It also lays the groundwork for splitting functions into multiple fragments (as opposed to a strict hot/cold split). Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D129518	2022-07-16 17:23:24 -07:00
Amir Ayupov	77b72fbc71	[BOLT][TEST] Add icp-inline.s test Add a test for `-icp-inline` knob, which ensures that ICP is only performed for functions that can be subsequently inlined. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D129803	2022-07-15 20:49:26 -07:00
Fangrui Song	d11ac9641b	[bolt] Include <atomic>	2022-07-15 14:27:01 -07:00
Huan Nguyen	ae563c9146	[BOLT] Support split landing pad We previously supported split jump tables, where some jump table entries target different fragments of the same function. In this fix, we provide support for another type of intra-indirect transfer: landing pads. When C++ exception handling is used, the compiler emits .gcc_except_table that describes the location of the catch block (landing pad) for a specific range that potentially invokes a throw(). Normally landing pads reside in the function, but with -fsplit-machine-functions, landing pads can be moved to another fragment. The intuition is, landing pads are rarely executed, so the compiler can move them to the .cold section. This update will mark all fragments that have a landing pad in another fragment as non-simple, and later propagate non-simple to all related fragments. This update also includes one manual test case: split-landing-pad.s Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D128561	2022-07-14 18:10:22 -07:00
Fabian Parzefall	d55dfeaf32	[BOLT] Replace uses of layout with basic block list As we are moving towards support for multiple fragments, loops that iterate over all basic blocks of a function, but do not depend on the order of basic blocks in the final layout, should iterate over binary functions directly, rather than the layout. Eventually, all loops using the layout list should either iterate over the function, or be aware of multiple layouts. This patch replaces references to binary function's block layout with the binary function itself where only little code changes are necessary. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D129585	2022-07-14 13:07:05 -07:00
Huan Nguyen	05523dc32d	[BOLT] Support multiple parents for split jump table There are two assumptions regarding a jump table: (a) It is accessed by only one fragment, say, Parent (b) All entries target instructions in Parent For (a), BOLT stores jump table entries as relative offsets to Parent. For (b), BOLT treats jump table entries that target somewhere out of Parent as INVALID_OFFSET, including fragments of the same split function. In this update, we extend (a) and (b) to include fragments of the same split function. For (a), we store jump table entries as absolute offsets instead. In addition, the jump table will store all fragments that access it. A fragment uses this information to only create labels for jump table entries that target that fragment. For (b), using absolute offsets allows jump table entries to target fragments of the same split function, i.e., extend support for split jump tables. This can be done using relocation (fragment start/size) and fragment detection heuristics (e.g., using symbol name pattern for non-stripped binaries). For jump table targets that can only be reached by one fragment, we mark them as local labels; otherwise, they would be the secondary function entry to the target fragment. Test Plan ``` ninja check-bolt ``` Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D128474	2022-07-13 23:37:31 -07:00
Vladislav Khmelevsky	35efe1d806	[BOLT][AArch64] Handle gold linker veneers The gold linker veneers are written between functions without symbols, so we need to handle them specially in BOLT. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D129260	2022-07-13 14:47:22 +03:00
Denis Revunov	7564167885	[BOLT][AArch64] Use all supported CPU features on AArch64 Since we now have +all feature for AArch64 disassembler, we can use it in BOLT and allow it to disassemble all ARM instructions supported by LLVM. Reviewed by: rafauler Differential Revision: https://reviews.llvm.org/D129139	2022-07-12 03:56:04 -04:00
Rafael Auler	42a66fb727	[BOLT] Restrict execution of tests that fail on Windows Turn off execution of tests that use UNIX-specific features. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D126933	2022-07-11 17:59:58 -07:00
Rafael Auler	a3cfdd746e	[BOLT] Increase coverage of shrink wrapping [5/5] Add -experimental-shrink-wrapping flag to control when we want to move callee-saved registers even when addresses of the stack frame are captured and used in pointer arithmetic, making it more challenging to do alias analysis to prove that we do not access optimized stack positions. This alias analysis is not yet implemented, hence, it is experimental. In practice, though, no compiler would emit code to do pointer arithmetic to access a saved callee-saved register unless there is a memory bug or we are failing to identify a callee-saved reg, so I'm not sure how useful it would be to formally prove that. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D126115	2022-07-11 17:30:13 -07:00
Rafael Auler	3e5f67f356	[BOLT] Increase coverage of shrink wrapping [4/5] Change shrink-wrapping to try a priority list of save positions, instead of trying the best one and giving up if it doesn't work. This also increases coverage. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D126114	2022-07-11 17:30:05 -07:00
Rafael Auler	3332904ad6	[BOLT] Increase coverage of shrink wrapping [3/5] Add the option to run -equalize-bb-counts before shrink wrapping to avoid unnecessarily optimizing some CFGs where profile is inaccurate but we can prove two blocks have the same frequency. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D126113	2022-07-11 17:30:00 -07:00
Rafael Auler	3508ced6ea	[BOLT] Increase coverage of shrink wrapping [2/5] Refactor isStackAccess() to reflect updates by D126116. Now we only handle simple stack accesses and delegate the rest of the cases to getMemDataSize. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D126112	2022-07-11 17:29:54 -07:00
Rafael Auler	42465efd17	[BOLT] Increase coverage of shrink wrapping [1/5] Change how function score is calculated and provide more detailed statistics when reporting back frame optimizer and shrink wrapping results. In these new statistics, we provide dynamic coverage numbers. The main metric for shrink wrapping is the number of executed stores that were saved because of shrink wrapping (push instructions that were either entirely moved away from the hot block or converted to a stack adjustment instruction). There is still a number of reduced load instructions (pop) that we are not counting at the moment. Also update alloc combiner to report dynamic numbers, as well as frame optimizer. For debugging purposes, we also include a list of top 10 functions optimized by shrink wrapping. These changes are aimed at better understanding the impact of shrink wrapping in a given binary. We also remove an assertion in dataflow analysis so that it does not choke on empty functions (which makes no sense). Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D126111	2022-07-11 17:29:22 -07:00
spupyrev	228970f612	Revert "Rebase: [Facebook] Revert "[BOLT] Update dynamic relocations from section relocations"" This reverts commit `76029cc53e`.	2022-07-11 09:50:47 -07:00
spupyrev	eecd41aa09	Revert "Rebase: [Facebook] [MC] Introduce NeverAlign fragment type" This reverts commit `6d0528636a`.	2022-07-11 09:50:47 -07:00
spupyrev	7228371054	[BOLT] Do not merge cold and hot chains of basic blocks There is a post-processing in ext-tsp block reordering that merges some blocks into chains. This allows maintaining the original block order in the absence of profile data and can be beneficial for code size (when fallthroughs are merged). In the earlier version we could merge hot and cold (with zero execution count) chains, that later were split by SplitFunction.cpp (when split-all-cold=1). The diff eliminates the redundant merging. It is unlikely the change will affect the performance of a binary in a measurable way, as it mostly operates on cold basic blocks. However, after the diff the impact of split-all-cold is almost negligible and we can avoid the extra function splitting. Measuring on the clang binary (negative is good, positive is a regression): clang12 benchmark1: `0.0253` benchmark2: `-0.1843` benchmark3: `0.3234` benchmark4: `0.0333` clang10 benchmark1 `-0.2517` benchmark2 `-0.3703` benchmark3 `-0.1186` benchmark4 `-0.3822` clang7 benchmark1 `0.2526` benchmark2 `0.0500` benchmark3 `0.3024` benchmark4 `-0.0489` Overall: `-0.0671 ± 0.1172` (insignificant) Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D129397	2022-07-11 09:31:52 -07:00
Maksim Panchenko	76029cc53e	Rebase: [Facebook] Revert "[BOLT] Update dynamic relocations from section relocations" Summary: This reverts commit `729d29e167`. Needed as a workaround for T112872562. Manual rebase conflict history: https://phabricator.intern.facebook.com/D35230076 https://phabricator.intern.facebook.com/D35681740 Test Plan: sandcastle Reviewers: #llvm-bolt Subscribers: spupyrev Differential Revision: https://phabricator.intern.facebook.com/D37098481	2022-07-11 09:31:52 -07:00
Rafael Auler	6d0528636a	Rebase: [Facebook] [MC] Introduce NeverAlign fragment type Summary: Introduce NeverAlign fragment type. The intended usage of this fragment is to insert it before a pair of macro-op fusion eligible instructions. NeverAlign fragment ensures that the next fragment (first instruction in the pair) does not end at a given alignment boundary by emitting a minimal size nop if necessary. In effect, it ensures that a pair of macro-fusible instructions is not split by a given alignment boundary, which is a precondition for macro-op fusion in modern Intel Cores (64B = cache line size, see Intel Architecture Optimization Reference Manual, 2.3.2.1 Legacy Decode Pipeline: Macro-Fusion). This patch introduces functionality used by BOLT when emitting code with MacroFusion alignment already in place. The use case is different from BoundaryAlign and instruction bundling: - BoundaryAlign can be extended to perform the desired alignment for the first instruction in the macro-op fusion pair (D101817). However, this approach has higher overhead due to reliance on relaxation as BoundaryAlign requires in the general case - see https://reviews.llvm.org/D97982#2710638. - Instruction bundling: the intent of NeverAlign fragment is to prevent the first instruction in a pair ending at a given alignment boundary, by inserting at most one minimum size nop. It's OK if either instruction crosses the cache line. Padding both instructions using bundles to not cross the alignment boundary would result in excessive padding. There's no straightforward way to request instruction bundling to avoid a given end alignment for the first instruction in the bundle. LLVM: https://reviews.llvm.org/D97982 Manual rebase conflict history: https://phabricator.intern.facebook.com/D30142613 Test Plan: sandcastle Reviewers: #llvm-bolt Subscribers: phabricatorlinter Differential Revision: https://phabricator.intern.facebook.com/D31361547	2022-07-11 09:31:52 -07:00
Vladislav Khmelevsky	e10e120cea	[BOLT][Runtime] Fix memset definition Differential Revision: https://reviews.llvm.org/D129321	2022-07-09 01:17:08 +03:00
Michał Chojnowski	bd301a418b	[BOLT] Fix concurrent hash table modification in the instrumentation runtime `__bolt_instr_data_dump()` does not lock the hash tables when iterating over them, so the iteration can happen concurrently with a modification done in another thread, when the table is in an inconsistent state. This also has been observed in practice, when it caused a segmentation fault. We fix this by locking hash tables during iteration. This is done by taking the lock in `forEachElement()`. The only other site of iteration, `resetCounters()`, has been correctly locking the table even before this patch. This patch removes its `Lock` because the lock is now taken in the inner `forEachElement()`. Reviewed By: maksfb, yota9 Differential Revision: https://reviews.llvm.org/D129089	2022-07-07 14:27:29 +03:00
Maksim Panchenko	ea2182fedd	[BOLT] Add runtime functions required by freestanding environment Compiler can generate calls to some functions implicitly, even under constraints of freestanding environment. Make sure these functions are available in our runtime objects. Fixes test failures on some systems after https://reviews.llvm.org/D128960. Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D129168	2022-07-06 11:22:22 -07:00
Elvina Yakubova	35155a0716	[BOLT] Change mutex implementation Changed acquire implementation to __atomic_test_and_set() and release to __atomic_clear() so it eliminates inline asm usage and is arch independent. Elvina Yakubova, Advanced Software Technology Lab, Huawei Reviewers: yota9, maksfb, rafauler Differential Revision: https://reviews.llvm.org/D129162	2022-07-06 08:19:54 +03:00
Maksim Panchenko	3a47037fcc	[BOLT] Fix instrumentation problem with floating point If BOLT instrumentation runtime uses XMM registers, it can interfere with the user program causing crashes and unexpected behavior. This happens as the instrumentation code preserves general purpose registers only. Build BOLT instrumentation runtime with "-mno-sse". Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D128960	2022-07-01 15:29:36 -07:00
Alexander Yermolovich	e159abdb04	[BOLT][DWARF] Support mix mode DWARF Added support for mixing monolithic DWARF5 with legacy DWARF, and monolithic legacy and DWARF5 split dwarf. Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D128232	2022-06-30 16:53:15 -07:00
Amir Ayupov	66b01a8934	[BOLT] Fix getDynoStats to handle BCs with no functions Address fuzzer crash Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D120696	2022-06-30 01:18:45 -07:00
Amir Ayupov	cb75faf40c	[X86][BOLT] Use getOperandType to determine memory access size Generate INSTRINFO_OPERAND_TYPE table in X86GenInstrInfo.inc. This diff adds support for instructions that were previously reported as having memory access size 0. It replaces the heuristic of looking at instruction register width to determine memory access width by instead checking the memory operand type using tablegen-provided tables. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D126116	2022-06-30 00:25:32 -07:00
Amir Ayupov	798e92c6c4	[BOLT] Respect shouldPrint in dump-dot-all Don't dump dot CFG graph for functions that should not be printed. Reviewed By: rafauler, maksfb Differential Revision: https://reviews.llvm.org/D128699	2022-06-29 17:01:17 -07:00
Maksim Panchenko	ed74304506	[BOLT] Fix EH trampoline backout code When SplitFunctions pass adds a trampoline code for exception landing pads (limited to shared objects), it may increase the size of the hot fragment making it larger than the whole function pre-split. When this happens, the pass reverts the splitting action by restoring the original block order and marking all blocks hot. However, if createEHTrampolines() added new blocks to the CFG and modified invoke instructions, simply restoring the original block layout will not suffice as the new CFG has more blocks. For proper backout of the split, modify the original layout by merging in trampoline blocks immediately before their matching targets. As a result, the number of blocks increases, but the number of instructions and the function size remains the same as pre-split. Add an assertion for the number of blocks when updating a function layout. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D128696	2022-06-29 14:35:57 -07:00
Fabian Parzefall	e341e9f094	[BOLT] Add option to randomize function split point For test purposes, we want to split functions at a random split point to be able to test different layouts without relying on the profile. This patch introduces an option, that randomly chooses a split point to partition blocks of a function into hot and cold regions. Reviewed By: Amir, yota9 Differential Revision: https://reviews.llvm.org/D128773	2022-06-29 13:02:05 -07:00
Rafael Auler	fc2d96c334	Revert "[BOLT][AArch64] Handle gold linker veneers" This reverts commit `425dda76e9`. This commit is currently causing BOLT to crash in one of our binaries and needs a bit more checking to make sure it is safe to land.	2022-06-28 19:23:28 -07:00
Vladislav Khmelevsky	425dda76e9	[BOLT][AArch64] Handle gold linker veneers The gold linker veneers are written between functions without symbols, so we need to handle them specially in BOLT. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D128082	2022-06-28 16:14:05 +03:00

1435 Commits