llvm-project

Commit Graph

Author	SHA1	Message	Date
Alexander Yermolovich	612f0f4568	[BOLT][DWARF] Fix gdb index section Since we now re-write .debug_info the DWARF CU Offsets can change. Just like for .debug_aranges the GDB Index will need to be updated. Reviewed By: Amir, maksfb Differential Revision: https://reviews.llvm.org/D118273	2022-01-27 12:07:58 -08:00
Amir Ayupov	5c238be04b	[BOLT][TEST] Adjust tests for BOLT_CLANG_EXE=clang-{6..9} Fix tests to pass with clang-6..9 on Ubuntu 20.04. Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D118282	2022-01-26 17:12:54 -08:00
Vladislav Khmelevsky	dcc595ea3c	[BOLT] Fix DWARFv5 for aarch64 This patch reverts patch "DWARFv5 default: Switch bolt tests to use DWARFv4 since Bolt doesn't support v5 yet" and places the -gdwarf-4 flag to the global cflags config file. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D118283	2022-01-27 02:14:58 +03:00
Vladislav Khmelevsky	20e9d4caf0	[BOLT] Prepare BOLT for unit-testing This patch adds unit testing support for BOLT. In order to do this we need to make at least these changes at the code level: * Make createMCPlusBuilder accessible externally * Remove positional InputFilename argument to bolt utility sources And prepare the cmake and lit for the new tests. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Reviewed By: maksfb, Amir Differential Revision: https://reviews.llvm.org/D118271	2022-01-27 00:22:13 +03:00
David Blaikie	9407a70179	DWARFv5 default: Switch bolt tests to use DWARFv4 since Bolt doesn't support v5 yet Rough attempt to fix these, since I don't have bolt building locally. Will see how the buildbots go with it...	2022-01-24 15:09:35 -08:00
Vladislav Khmelevsky	bb8e7ebaad	[BOLT] Remove unreachable uncond branch after return This patch fixes the removal of an unreachable unconditional branch located after a return instruction. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D117677	2022-01-19 22:06:26 +03:00
Amir Ayupov	5a4bf4c2b3	[BOLT][CMAKE] Use BOLT_CLANG_EXE and BOLT_LLD_EXE as is Add an ability to provide paths that don't match tool name exactly: e.g. clang-13. Remove use_lld call that sets up unused extra tools. Test plan: ``` cmake -G Ninja ../llvm-project/llvm -DLLVM_TARGETS_TO_BUILD="X86;AArch64" -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ENABLE_PROJECTS="bolt" -DBOLT_CLANG_EXE=/usr/bin/clang-13 -DBOLT_LLD_EXE=/usr/bin/lld-13 ... llvm-lit: /data/llvm-project/llvm/utils/lit/lit/llvm/config.py:436: note: using clang: /usr/bin/clang-13 llvm-lit: /data/llvm-project/llvm/utils/lit/lit/llvm/config.py:436: note: using lld: /usr/bin/lld-13 cmake -G Ninja ../llvm-project/llvm -DLLVM_TARGETS_TO_BUILD="X86;AArch64" -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ENABLE_PROJECTS="bolt;lld" -DBOLT_CLANG_EXE=/usr/bin/clang-13 ... llvm-lit: /data/llvm-project/llvm/utils/lit/lit/llvm/config.py:436: note: using clang: /usr/bin/clang-13 llvm-lit: /data/llvm-project/llvm/utils/lit/lit/llvm/config.py:436: note: using lld: /data/llvm-build2/bin/lld cmake -G Ninja ../llvm-project/llvm -DLLVM_TARGETS_TO_BUILD="X86;AArch64" -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ENABLE_PROJECTS="bolt;clang;lld" ... llvm-lit: /data/llvm-project/llvm/utils/lit/lit/llvm/config.py:436: note: using clang: /data/llvm-build3/bin/clang llvm-lit: /data/llvm-project/llvm/utils/lit/lit/llvm/config.py:436: note: using lld: /data/llvm-build3/bin/lld ``` Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D117446	2022-01-18 21:14:00 -08:00
Vladislav Khmelevsky	ad4e26833f	updateDWARFObjectAddressRanges: nullify low pc In case the DW_AT_ranges tag already exists for the object, the low pc values won't be updated and will be incorrect in after-bolt binaries. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D117216	2022-01-18 22:37:29 +03:00
Amir Aupov	90ada97f36	[BOLT][TEST] Update exceptions-instrumentation.test Matching an exact byte offset is fragile if a different version of compiler is used (e.g. distro clang). Resolves an issue with running with BOLT_CLANG_EXE + clang-12 Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D117440	2022-01-18 11:17:57 -08:00
Amir Ayupov	de3e3fcfa3	[BOLT][CMAKE] Accept BOLT_CLANG_EXE and BOLT_LLD_EXE Add CMake options to supply clang and lld binaries for use in check-bolt instead of requiring the build of clang and lld projects. Suggested by Mehdi Amini in https://lists.llvm.org/pipermail/llvm-dev/2021-December/154426.html Test Plan: ``` cmake -G Ninja ~/local/llvm-project/llvm \ -DLLVM_TARGETS_TO_BUILD="X86" \ -DCMAKE_BUILD_TYPE=Release \ -DLLVM_ENABLE_ASSERTIONS=ON \ -DLLVM_ENABLE_PROJECTS="bolt" \ -DBOLT_CLANG_EXE=~/local/bin/clang \ -DBOLT_LLD_EXE=~/local/bin/lld ninja check-bolt ... llvm-lit: /home/aaupov/local/llvm-project/llvm/utils/lit/lit/llvm/config.py:436: note: using clang: /home/aaupov/local/bin/clang llvm-lit: /home/aaupov/local/llvm-project/llvm/utils/lit/lit/llvm/config.py:436: note: using ld.lld: /home/aaupov/local/bin/ld.lld llvm-lit: /home/aaupov/local/llvm-project/llvm/utils/lit/lit/llvm/config.py:436: note: using lld-link: /home/aaupov/local/bin/lld-link llvm-lit: /home/aaupov/local/llvm-project/llvm/utils/lit/lit/llvm/config.py:436: note: using ld64.lld: /home/aaupov/local/bin/ld64.lld llvm-lit: /home/aaupov/local/llvm-project/llvm/utils/lit/lit/llvm/config.py:436: note: using wasm-ld: /home/aaupov/local/bin/wasm-ld ... ``` Tested all configurations: - LLVM_ENABLE_PROJECTS="bolt;clang;lld" + no BOLT_*_EXE - LLVM_ENABLE_PROJECTS="bolt;clang" + BOLT_LLD_EXE - LLVM_ENABLE_PROJECTS="bolt;lld" + BOLT_CLANG_EXE - LLVM_ENABLE_PROJECTS="bolt" + BOLT_CLANG_EXE + BOLT_LLD_EXE - LLVM_ENABLE_PROJECTS="bolt;clang;lld" + BOLT_CLANG_EXE + BOLT_LLD_EXE Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D117061	2022-01-15 04:37:29 -08:00
Amir Ayupov	2d97f0f2ef	[BOLT][TEST] Move exceptions-instrumentation.test to X86 The aarch64 instrumentation is currently unsupported so the test is failing. Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D117102	2022-01-12 09:25:12 -08:00
Rafael Auler	b392ec696b	Re-enable Windows build and fix issues Summary: Fix missing string header file inclusion and link_fdata find problem in lit tests. Change root-level tests to require linux. Re-enable Windows in our root CMakeLists.txt. (cherry picked from FBD33296290)	2021-12-23 05:59:35 -08:00
Rafael Auler	07d9e014ed	[BOLT] Don't use ld.lld in tests Summary: Addressing issue 270. (cherry picked from FBD33255608)	2021-12-21 07:36:35 -08:00
Vladislav Khmelevsky	08f56926c2	[BOLT] Move disassemble optimizations to optimization passes Summary: The patch moves the shortenInstructions and nop remove to separate binary passes. As a result when llvm-bolt optimizations stage will begin the instructions of the binary functions will be absolutely the same as it was in the binary. This is needed for the golang support by llvm-bolt. Some of the tests must be changed, since bb alignment nops might create unreachable BBs in original functions. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei (cherry picked from FBD32896517)	2021-12-18 17:03:35 -08:00
Rafael Auler	46e93fb427	Fix frameopt crash when processing POPF Summary: POPF instruction was triggering an assertion in our analysis. (cherry picked from FBD33141809)	2021-12-15 13:29:46 -08:00
Elvina Yakubova	4a4045f740	[PR] Fix update-debug-sections for AArch64 Summary: This patch adds AArch64 relocations handling in case updating of debug sections is enabled Elvina Yakubova, Advanced Software Technology Lab, Huawei (cherry picked from FBD33077609)	2021-12-08 16:53:38 +03:00
Amir Ayupov	6aa735ceaf	[BOLT] Split functions: support fragments with multiple parents Summary: Gracefully handle binaries with split functions where two fragments are folded into one, resulting in a fragment with two parent functions. This behavior is expected in GCC8+ with -O2 optimization level, where both function splitting and ICF are enabled by default. On the BOLT side, the changes are: - BinaryFunction: allow multiple parent fragments: - `ParentFragment` --> `ParentFragments`, - `setParentFragment` --> `addParentFragment`. - BinaryContext: - `populateJumpTables`: mark fragments to be skipped later, - `registerFragment`: add a name heuristic check, return false if it failed, - `processInterproceduralReferences`: check if `registerFragment` succeeded, otherwise issue a warning, - `skipMarkedFragments`: move out fragment traversal and skipping from `populateJumpTables` into a separate function. This change fixes an issue where unrelated functions might be registered as fragments: ``` BOLT-WARNING: interprocedural reference between unrelated fragments: bad_gs/1(2) and amd_decode_mce.cold.27/1(2) ``` (Linux kernel binary) (cherry picked from FBD32786688)	2021-12-01 21:14:56 -08:00
Maksim Panchenko	b73c87bc4f	[BOLT][DWARF] Force allocation of debug_line in RuntimeDyld Summary: Currently, RuntimeDyld will not allocate a section without relocations even if such a section is marked allocatable and defines symbols. When we emit .debug_line for compile units with unchanged code, we output original (input) data, without relocations. If all units are emitted in this way, we will have no relocations in the emitted .debug_line. RuntimeDyld will not allocate the section and as a result we will write an empty .debug_line section. To workaround the issue, always emit a relocation of RELOC_NONE type when emitting raw contents to debug_line. (cherry picked from FBD32909869)	2021-12-06 23:32:40 -08:00
Maksim Panchenko	cbf530bf41	[BOLT] Add pass to normalize CFG Summary: Some optimizations may remove all instructions in a basic block. The pass will cleanup the CFG afterwards by removing empty basic blocks and merging duplicate CFG edges. The normalized CFG is printed under '-print-normalized' option. (cherry picked from FBD32774360)	2021-12-01 13:57:50 -08:00
Amir Ayupov	fd71cc5163	[BOLT][TESTS] Move debugTypesBug.s test into binary tests Summary: Remove the test and its inputs. (cherry picked from FBD32855788)	2021-12-03 16:57:24 -08:00
Amir Ayupov	02145d20ab	[BOLT] Tail duplication: disable const/copy propagation by default as a workaround Summary: Disable const/copy propagation as a bug workaround. Also add the debug logging in aggressive duplication. (cherry picked from FBD32774744)	2021-12-01 14:05:05 -08:00
Amir Ayupov	76cd07f9e4	[BOLT] Tail Duplication: fix jump table check Summary: The intent is clearly to check the current basic block. (cherry picked from FBD32658103)	2021-11-24 15:39:24 -08:00
Amir Ayupov	7261655d2c	[BOLT] Tail Duplication: skip unreachable blocks Summary: TailDuplication::isInCacheLine makes the assumption that the block has a valid layout index, which is not the case for unreachable blocks. Add a check for a valid layout index. (cherry picked from FBD32659755)	2021-11-24 16:13:42 -08:00
Amir Ayupov	e9ee2ca1fa	[BOLT][TEST] Fix runtime/X86/retpoline-synthetic.test Summary: Restructure the test to prevent command echo from getting to check statements. (cherry picked from FBD32635888)	2021-11-23 20:33:50 -08:00
Vladislav Khmelevsky	a944a487ae	[PR] Fix ShrinkWrapping pop order Summary: The push and pop instructions might be reordered incorrectly due to this error. Thanks rafaelauler for the provided test case. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei (cherry picked from FBD32478348)	2021-11-14 02:23:20 +03:00
Rafael Auler	2f3285989e	[BOLT] Fix tailcall-traps and basic-instr tests on ubuntu Summary: These tests are failing on opensource ubuntu. (cherry picked from FBD32514489)	2021-11-17 15:38:36 -08:00
Amir Ayupov	c7f8adb87f	[BOLT][TEST] Add llvm-boltdiff to build/test requirements Summary: llvm-boltdiff is required for `runtime/meta-merge-fdata.test` (cherry picked from FBD32442220)	2021-11-15 13:53:18 -08:00
Amir Ayupov	1d0a276c72	[BOLT][TEST] Import small tests Summary: Imported small internal tests. (cherry picked from FBD32405870)	2021-11-12 15:38:45 -08:00
Amir Ayupov	7ea61dab03	[BOLT][TEST] Reduce vararg.test Summary: Reduce assembly inputs to vararg.test using CReduce (cherry picked from FBD32405869)	2021-11-11 20:05:09 -08:00
Amir Ayupov	0e7dd1aad1	[BOLT][TEST] Import small tests Summary: Imported small internal tests. (cherry picked from FBD32371964)	2021-11-11 14:28:46 -08:00
Amir Ayupov	3a16f2169d	[BOLT][TEST] Import jump-table-icp.test, update link_fdata script Summary: Import the test. The assembly input has three functions with associated fdata. The old link_fdata.sh script only replaces the symbol names with symbol values, whereas fdata format expects to have symbol offsets against the anchor symbol. Introduce the link_fdata.py script which is able to parse the input and produce either an offset or an absolute symbol value. (cherry picked from FBD32256351)	2021-11-08 10:56:21 -08:00
Amir Ayupov	8331f75e28	[BOLT][TEST] Rename tests to follow standard naming scheme Summary: The majority of tests in LLVM projects are using - instead of _ in the name, i.e. `check-something.test` is preferred over `check_something.test`. It makes sense for us to adopt the same naming scheme for our future tests and to rename existing ones. (cherry picked from FBD32185879)	2021-11-04 13:36:15 -07:00
Amir Ayupov	2e0ad6ffe4	[BOLT][TEST] Import small tests Summary: Imported small internal tests: - fallthrough-to-noop.test (cherry picked from FBD32158100)	2021-11-03 17:09:49 -07:00
Amir Ayupov	d1df113e30	[BOLT][TEST] Add instrumentation test using merge-fdata Summary: BOLT meta test using merge-fdata tool. This tests BOLT instrumentation for a non-trivial binary, running instrumented binary, and using the instrumentation profile for BOLT optimizations. The results are verified between original, instrumented, and optimized binaries. Additional tested features: boltdiff mode and merge-fdata for two profiles. merge-fdata tool is linked with relocs on Linux to support this test. (cherry picked from FBD32141812)	2021-11-03 10:41:26 -07:00
Amir Ayupov	f808ea00bd	[BOLT][TEST] Import small tests Summary: Imported small internal tests: - asm_func_debug.test - basic_instrumentation.test - bolt_icf.test - ctc_and_unreachable.test - double_jump.test - exceptions_args.test - exceptions_instrumentation.test - fptr.test (cherry picked from FBD32032684)	2021-10-29 13:31:22 -07:00
Rafael Auler	443f1b4ff4	Rebase: [BOLT] AsmDump: dump function assembly and profile info Summary: Added new functionality of dumping simple functions into assembly. This includes: - function control flow (basic blocks, instructions), - profile information as `FDATA` directives, to be consumed by link_fdata, - data labels, - CFI directives, - symbols for callee functions, - jump table symbols. Envisioned usage: 1. Find a function that triggers BOLT crash (e.g. with `bughunter.sh`). 2. Generate reproducer asm source for that function (using `-funcs`). 3. Attach it to an issue. 4. Reduce and include as a test case. Current limitations: 1. Emitted assembly won't match input file relocations. 2. No DWARF support. 3. Data is not emitted. (cherry picked from FBD32746857)	2021-09-27 10:51:25 -07:00
Rafael Auler	0559dab546	[BOLT] Improve cmake configs for opensource Summary: Change cmake config in BOLT to only support Linux. On other platforms, we print a warning that we won't build BOLT. Change configs to determine whether we will build BOLT runtime libs. This only happens on x86 hosts. If true, we will build the runtime and enable bolt-runtime tests. New tests that depend on the bolt_rt lib need to be marked REQUIRES:bolt-runtime. I updated the relevant tests. Fix cmake to not crash when building llvm with a target that BOLT does not support. (cherry picked from FBD31935760)	2021-10-26 12:26:23 -07:00
Elvina Yakubova	53ec21e3a1	[PR][BOLT][TEST] Fix tests Summary: Add lit.local.cfg to X86 and AArch64 folders. Fix host_arch in lit config for AArch64. Fix AArch64 and X86 tests. Elvina Yakubova, Advanced Software Technology Lab, Huawei (cherry picked from FBD31702068)	2021-10-11 11:15:08 +03:00
Amir Ayupov	01a81dca41	[BOLT][TEST] Imported small tests Summary: Imported small internal tests: - shared_object.test - shrinkwrapping.test - static_exe.test - tailcall.test - vararg.test (cherry picked from FBD31523478)	2021-10-08 18:23:32 -07:00
Amir Ayupov	44e08ead30	[BOLT][TEST] Imported small tests Summary: Imported small internal tests: - sctc_bug{,2,3,4}.test (cherry picked from FBD31517120)	2021-10-08 14:49:23 -07:00
Amir Ayupov	f44e1df9d0	[BOLT][TEST] Imported small tests Summary: Imported small internal tests: - re-optimize.test - relaxed_tailcall.test - remove_unused.test - retpoline_synthetic.test (cherry picked from FBD31516680)	2021-10-08 14:33:33 -07:00
Amir Ayupov	872013e077	[BOLT][TEST] Imported small tests Summary: Imported small internal tests: - cfi_instrs_reordered.s - no_entry_reordering.test - no_relocs.test - pie.test (cherry picked from FBD31514823)	2021-10-08 13:39:24 -07:00
Amir Ayupov	d41b4e6e2d	[BOLT][TEST] Imported small tests Summary: Imported small internal tests: - keep_aranges.test - layout_heuristic.test - line_number.test - block_reordering.test - branch_data.test - reader.test (cherry picked from FBD31486371)	2021-10-07 13:38:58 -07:00
Amir Ayupov	c74e5bfee3	[BOLT][TEST] Imported small tests Summary: Imported small internal tests: - jmp_optimization.test - jmpjmp.test - jump_table_footprint_reduction.test - jump_table_reference.test (cherry picked from FBD31483122)	2021-10-06 16:20:00 -07:00
Amir Ayupov	92e306de0c	[BOLT][TEST] Imported small tests Summary: Imported small internal tests: - indirect_goto.test - indirect_goto_pie.test - inlined_function_mixed.test (cherry picked from FBD31446571)	2021-10-06 12:23:05 -07:00
Vladislav Khmelevsky	5f953277a9	[PR] Handle relocations in constant islands Summary: In non-PIC binaries the compiler could save absolute addresses in a constant island, which we should handle properly. This patch adds relocations handling in constant islands. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei (cherry picked from FBD31416848)	2021-10-04 19:05:18 +03:00
Amir Ayupov	8ab49cb4aa	[BOLT] link_fdata: accept symbols with slash in the name Summary: Change sed separator to allow replacing symbols with slash in the name. This is required for symbol names produced by BOLT which include "/1" suffix. (cherry picked from FBD31324540)	2021-09-30 16:11:09 -07:00
Amir Ayupov	b86c91eae0	[BOLT][TEST] Imported small tests Summary: Imported small internal tests: - invalid_profile.test - internal_call.test - internal_call_instrument.test (cherry picked from FBD31452386)	2021-10-06 14:25:29 -07:00
Vladislav Khmelevsky	e424d16f0e	[PR] AArch64: Add TSTBR14 and CONDB19 relocations support Summary: This patch adds R_AARCH64_TSTBR14 and R_AARCH64_CONDBR19 relocations support in order to handle conditional branches, cbz/cbnz and tbz/tbnz instructions correctly Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei (cherry picked from FBD31416734)	2021-10-03 13:41:41 +03:00
Vladislav Khmelevsky	848f07792c	[PR] Update skipRelocationProcess Summary: The ELF::R_AARCH64_TLSDESC_LD64_LO12 and ELF::R_AARCH64_TLSDESC_ADR_PAGE21 relocations might also be relaxed to mov instructions, handle these cases Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei (cherry picked from FBD31353063)	2021-10-01 22:06:15 +03:00
Maksim Panchenko	8ef3b27834	[BOLT][DWARF] Properly emit end-of-sequence entries for line tables Summary: When the compiler emits a line table program, it emits EOS using the label at the end of the containing code section. Since each compilation unit has its own set of code sections it works as expected (* see the excerpt from the standard below). However, in BOLT the code from many CUs is combined into a common section, such as hot text or cold text. As a result, the symbol at the end of the section may point way past the code sequence for a given unit. Since we can emit functions in any order, we conservatively emit end-of-sequence at the end of every emitted function. Fixes a problem while intermixing source code with disassembly in binutils' objdump. (*) DWARF v4 6.2.5.3: "Every line number program sequence must end with a DW_LNE_end_sequence instruction which creates a row whose address is that of the byte after the last target machine instruction of the sequence." (cherry picked from FBD31347870)	2021-09-30 17:47:50 -07:00
Amir Ayupov	e903671bbf	[BOLT][TEST] Imported small tests, removed duplicate input Summary: Imported small internal tests. - call_zero.s - cfi_expr_rewrite.s - cfi_insts_count.s - exceptions_pic.test - exceptions_run.test Removed duplicate input file (switch_statement.cpp) (cherry picked from FBD31355466)	2021-10-01 15:35:43 -07:00
Amir Ayupov	47455e98b3	[BOLT][TEST] Imported small tests Summary: Imported small internal tests: - R_X86_64_64.pic.lld.cpp - avx512_trap.test - bad_exe.test - bolt_info.test (cherry picked from FBD31251439)	2021-09-28 15:47:51 -07:00
Amir Ayupov	4157682fd9	[BOLT][TEST] Import internal_call_instrument.s Summary: Imported standalone assembly test (cherry picked from FBD31161181)	2021-09-23 14:28:13 -07:00
Amir Ayupov	6b4eb0b94a	[BOLT][TEST] Split runtime tests into test/runtime folder Summary: Create bolt/test/runtime folder and move tests that execute the binary. Move lit.local.cfg with host_arch check to the corresponding folder. Addresses issue facebookincubator/BOLT#132. AArch64/tls.c shows a different behavior with clang hence marked as XFAIL TODO: add a check for non-exec tests for a corresponding LLVM_TARGETS_TO_BUILD. (cherry picked from FBD31132234)	2021-09-22 17:58:33 -07:00
Vladislav Khmelevsky	e1da1539e3	[PR] Add AARCH64_MOVW_UABS_G* relocations support Summary: This patch fixes issue facebookincubator/BOLT#177 Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei (cherry picked from FBD31130162)	2021-09-23 00:52:36 +03:00
Amir Ayupov	d4fdc98140	[BOLT][TEST] Remove dependence on host_cc and host_cxx Summary: Add dependency on clang and clangxx instead. (cherry picked from FBD31128140)	2021-09-22 15:53:38 -07:00
Vladislav Khmelevsky	542c03c3a3	[PR] Fix aarch64 TLS relocations handling Summary: There are a few problems found when dealing with TLS relocations for aarch64. * RewriteInstance.cpp While analyzing TLS relocation we don't have to modify SymbolAddress (which is the offset from the TLS section), so we need to just skip verification. The non-got related TLS relocations on aarch64 might be skipped too ** The force relocation must be applied for GOT relocations on Aarch64. The symbol address for GOT relocation might not be pointing to the GOT section (for example ADRP GOT may point to the wrong section, since GOT table is not page-aligned), so we won't try to get section by the symbol address. * Relocation.cpp - Remove R_AARCH64_TLSLE_ADD_TPREL_HI12 and R_AARCH64_TLSLE_ADD_TPREL_LO12_NC from isGOT check, since they are not got-related relocations * BinaryFunction.h Remove R_AARCH64_TLSLE_ADD_TPREL_HI12 and R_AARCH64_TLSLE_ADD_TPREL_LO12_NC from adding to relocation list, since this is actually an offset in TLS section and BOLT does not change it, so we don't need to do anything with these relocations; the value won't change in new binary files Refactor the code, separating aarch64 and x86 relocations * AArch64MCPlusBuilder.cpp ** Add forgotten LO12 relocations to switch case to getTargetExprFor Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei (cherry picked from FBD31003349)	2021-09-02 21:04:33 +03:00
Maksim Panchenko	48fbeb1a46	[BOLT] Fix warnings from LLVM DWARF reading library Summary: LLVM started printing warnings when DWARFDebugInfoEntry::extractFast() is invoked trying to read a DIE past the current unit limits. This results in verbose warnings from BOLT which are harmless but confusing to the user. Check the boundaries before calling the API above. (cherry picked from FBD31097271)	2021-09-21 15:39:35 -07:00
Rafael Auler	7b779f819f	[BOLT] Fix binary corruption in non-reloc mode Summary: We have a problem where we will emit sections that we are not supposed to emit (with no output offset assigned). This will make us write at file offset 0 and corrupt the first sections in the binary (usually .interp section will be corrupted and bash will refuse to run the binary). This only happens in non-reloc mode when using JTS_BASIC and when we do not emit a function that has a jump table (if it gets too large). Using -update-debug-sections will trigger the pass check-large-functions, which will mark large funcs as non-simple and will hide this bug. (cherry picked from FBD30882012)	2021-09-10 16:19:50 -07:00
Vasily Leonenko	9aa134dc2d	[PR] Instrumentation: use TryLock for SimpleHashTable getter Summary: This commit introduces TryLock usage for SimpleHashTable getter to avoid deadlock and relax syscalls usage which causes significant overhead in runtime. The old behavior is left under the -conservative-instrumentation option passed to the instrumentation library. Also, this commit includes a corresponding test case: instrumentation of an executable which performs indirect calls from common code and a signal handler. Note: if TryLock fails to acquire the lock, this indirect call will not be accounted for in the resulting profile. Vasily Leonenko, Advanced Software Technology Lab, Huawei (cherry picked from FBD30821949)	2021-08-08 04:50:06 +08:00
Vasily Leonenko	e2480fcc98	[PR] LIT: add checking if maxIndividualTestTime is available on the platform Summary: This commit adds checking if maxIndividualTestTime is available on the platform. If available - it sets per test timeout to 60sec and declares lit-max-individual-test-time feature for further checking by particular test cases. Based on https://reviews.llvm.org/D64251 implementation. Vasily Leonenko, Advanced Software Technology Lab, Huawei (cherry picked from FBD30821986)	2021-08-27 21:56:24 +03:00
Joey Thaman	3e8af67a95	[BOLT] Optimize the three way branch Summary: Three way branches commonly appear in HHVM. They have one test and then two jumps. The jump's destinations are not currently optimized. This pass attempts to optimize which is the first branch. (cherry picked from FBD30460441)	2021-08-17 10:15:21 -07:00
Vladislav Khmelevsky	c040431fe6	[PR] AArch64: Fix ADR instruction handling Summary: There are 2 problems found when handling ADR instruction: 1. When extracting value from the ADR instruction we need to do it in a different way than we do for the ADRP instruction. 2. When creating target expression the VariantKind should be different for the ADR instruction. We also introduce R_AARCH64_ADR_PREL_LO21, R_AARCH64_TLSDESC_ADR_PREL21 and R_AARCH64_ADR_PREL_PG_HI21_NC relocations support. Also this patch introduces AdrPass, which will replace non-local pointing ADR instructions with ADRP + ADD instructions sequence due to small offset range of ADR instruction, so after BOLT magic there are no guarantees that ADR instruction will still be in the range of just +- 1MB from its target. The instruction replacement needs relocations to be available, so we won't remove "IsFromCode" relocations after disassembly from BF anymore. Also we need original offset of ADR instruction to be available so we add offset annotation for these instructions. The last thing this patch adds is ARM testing directory, which will be used only on ARM testing servers. The common tests (non-assembler tests which are platform-independent) might be moved from the X86 directory to the parent one in the future, so such tests could be tested on both X86 and ARM machines. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei (cherry picked from FBD30497379)	2021-08-20 03:07:01 +03:00
Joey Thaman	ef6186c822	[BOLT] Added Constant and Copy Propagation to tail duplicated blocks Summary: Added a function in TailDuplication that will do Constant and Copy Propagation for blocks that we duplicated as a part of tail duplication. Added supporting functions to MCPlusBuilder to find src registers and replace registers (cherry picked from FBD30231907)	2021-08-10 10:02:32 -07:00
Vladislav Khmelevsky	2a5790b670	[PR] Fdata: Escape whitespaces in symbol names Summary: This patch is part of preparation for golang support. The golang symbols might have spaces in the name (for example "type..eq.[10]interface {}"). Since fdata uses spaces as a field separator such names break the fdata format, so we need to escape whitespaces and backslashes in symbol names using the backslash character. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei (cherry picked from FBD29999491)	2021-06-29 19:54:08 +03:00
Rafael Auler	faee814fb9	Fix NFC tests Summary: Our NFC tests are failing on debug-fission-single.s. Fix the test to be compliant with our checking script. (cherry picked from FBD30352415)	2021-08-16 11:33:20 -07:00
Rafael Auler	d217e2f338	Rebase: [BOLT] DWP output support Summary: Added support for writing out DWP file. Works with regular dwo as input or DWP as input. (cherry picked from FBD31361619)	2021-06-29 15:28:52 -07:00
Vasily Leonenko	900914d3c6	[PR] Tests: add instrumentation tests for PIE exec & shared libs Summary: This commit adds dummy tests for checking instrumentation support for PIE executables and shared libraries. Vasily Leonenko, Advanced Software Technology Lab, Huawei (cherry picked from FBD30092729)	2021-06-19 23:01:28 +08:00
Maksim Panchenko	89a2e16037	[BOLT] Support PLT sections with variable entry sizes Summary: The linker can generate 8- or 16-byte entries in .plt.got and .plt.sec sections. On X86, the main differentiator is the presence of endbr64 instruction at the beginning of the entry. Detect the instruction and adjust the size accordingly. (cherry picked from FBD29847639)	2021-07-14 01:35:34 -07:00
Joey Thaman	a7e2a8f946	[BOLT] Tail Duplication active pass Summary: Amended the Tail Duplication analysis pass to do the tail duplication in question (cherry picked from FBD29833794)	2021-07-16 11:45:44 -07:00
Joey Thaman	2f46660559	[BOLT] Tail duplication analysis pass Summary: Created a binary pass that records how many times tail duplication would be used and how many cache misses it would theoretically stop (cherry picked from FBD29619858)	2021-07-01 07:11:26 -07:00
Maksim Panchenko	c9f5f47b51	[BOLT] Add support for .plt.sec and refactor PLT-reading code Summary: A binary can contain multiple PLT sections with different name and attributes (such as an entry size). Extend the support to .plt.sec and refactor the code to make future extensions simpler. (cherry picked from FBD29502107)	2021-06-30 14:41:41 -07:00
Maksim Panchenko	3e5ce1f282	[BOLT][TESTS] Remove dynamic relocations from YAML tests Summary: Our YAML objects contain references to dynamic relocations via .dynamic, but there are no corresponding relocation sections. Change .dynamic contents to specify no dynamic relocations. (cherry picked from FBD29502108)	2021-06-30 14:33:59 -07:00
Maksim Panchenko	f46af9e9bc	[BOLT][TESTS] Fix ICF test case Summary: Host compiler may generate duplicate functions and as a result BOLT can fold more than 1 function. (cherry picked from FBD29347302)	2021-06-23 16:13:30 -07:00
Maksim Panchenko	bbbd159ccb	[BOLT] Fix undefined symbol warnings/errors Summary: When we fold a function in relocation mode, make sure to clear its state to avoid emitting relocations against undefined symbols. (cherry picked from FBD29245320)	2021-06-18 14:35:39 -07:00
Maksim Panchenko	7bccf8d25d	[BOLT][NFC] Fix debug info printouts for inlined functions Summary: While printing debug info for instructions, we should use line tables from the corresponding DWARF CU which could be different from the containing function CU in case of inlined instructions. (cherry picked from FBD28908324)	2021-06-04 12:31:31 -07:00
Amir Ayupov	65d227c035	[BOLT][TEST] Fix test case to conform to analyzePICJumpTable pattern matching Summary: Make sure that jump table is properly recognized in `split_func_jump_table_fragment.s`. (cherry picked from FBD28839976)	2021-06-02 10:50:47 -07:00
Vladislav Khmelevsky	79807d99fe	[PR] Introduce loop inversion pass Summary: This patch introduces LoopInversionPass. Its main purpose is to ensure that the loop layout is optimal depending on the profile information. So if profile information shows that the loop is used, the unconditional jump instruction must be executed only once and vice-versa. Please take a look at the pass header file and test for more details. Also change the link_fdata script a bit so that the FDATA prefix can be changed, like FileCheck does. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei PR facebookincubator/BOLT#153 (cherry picked from FBD28391811)	2021-05-11 20:59:13 +03:00
Amir Ayupov	12e9fec697	Rebase: [BOLT] DebugFission Support Summary: Implemented support for Debug Fission. For the most part it doesn't impact Monolithic execution path. One area that was changed is the DW_AT_low_pc/DW_AT_high_pc conversion. Before it was to DW_AT_ranges/DW_AT_low_pc, now DW_AT_low_pc is kept in same place. Another more visible impact is in Skeleton CU the DW_AT_low_pc is replaced with DW_AT_ranges_base if it's not originally present and bolt converted ranges conversion inside the dwo units. Output of this are multiple .dwo files with updated debug information. (cherry picked from FBD29569788)	2021-04-01 11:43:00 -07:00
Amir Ayupov	99d7f90635	[BOLT][NFC][TEST] Added llvm-dwarfdump and llvm-mc to BOLT_TEST_DEPS (cherry picked from FBD28427352)	2021-05-13 15:36:43 -07:00
Vladislav Khmelevsky	de298c08fd	[PR] Fix tests build with -no-pie option Summary: Since gcc/ld could produce and expect PIE files we need to pass -no-pie option to avoid linking errors for tests. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei (cherry picked from FBD28360045)	2021-05-11 03:25:49 +03:00
Alexey Moksyakov	ce84e9607a	[PR] Fix bb reordering optimization Summary: The reorder-blocks optimization pass doesn't take into account that the available offset for legacy Jcc instructions (for example, JRCXZ - operand 8 bits) has to be less than 255 bytes. This is a rare case, so an extra check was added to exclude functions with such unsupported instructions from optimization passes. Alexey Moksyakov, Advanced Software Technology Lab, Huawei (cherry picked from FBD28264117)	2021-04-23 11:34:40 +03:00
Amir Ayupov	94653797f3	Rebase: [BOLT][NFC] Avoid binutils in tests Summary: Replace binutils tools with llvm tools (cherry picked from FBD29575630)	2021-05-04 16:45:28 -07:00
Amir Ayupov	081e39aa15	Rebase: [cherry-pick] [BOLT] Add option to skip writing an output file Summary: The user may wish to run BOLT for printing statistics only (i.e. to check that the profile is valid). Add an option to run BOLT without writing any output file, similar to a dry run. This option is triggered by supplying -o with "/dev/null". (cherry picked from FBD29568632)	2021-03-29 16:04:57 -07:00
Maksim Panchenko	e7169be93f	[BOLT] Do not assert on jump table heuristic failure Summary: During the initial indirect jump analysis, we used to assert that the discovered jump table type matched the pattern of the corresponding instruction sequence. E.g., for PIC jump table memory we expected the PIC jump table instruction sequence. The assertions were too conservative, as in the case of a mismatch we can mark the indirect jump as having an unknown control flow. That should be sufficient to either skip the function processing or rely on relocation information for possible recovery of the control flow. (cherry picked from FBD27255816)	2021-03-23 13:41:41 -07:00
Rafael Auler	b3c34d568a	[BOLT] Fix instrumentation bug in duplicated JTs Summary: Fix a bug with instrumentation when trying to instrument functions that share a jump table with multiple indirect jumps. Usually, each indirect jump that uses a JT will have its own copy of it. When this does not happen, we need to duplicate the jump table safely, so we can split the edges correctly (each copy of the jump table may have different split edges). For this to happen, we need to correctly match the sequence of instructions that perform the indirect jump to identify the base address of the jump table and patch it to point to the new cloned JT. A case was reported to us in which the compiler generated suboptimal code to do an indirect jump, which our matcher failed to identify. Fixes facebookincubator/BOLT#126 (cherry picked from FBD27065579)	2021-03-15 16:34:25 -07:00
Maksim Panchenko	b11c826889	[BOLT] Fix false references to zero-sized objects Summary: Whenever BOLT encounters a data reference in code, it tries to convert it into <Object+Offset> form. The primary reason behind this approach is to support read-only data-reordering optimization. However, with the current level of the linker and compiler support we don't have enough information to always correctly restore the original <Object+Offset>. E.g. with zero-sized symbols we have to speculate that the actual size of the underlying object extends to the next symbol. Most of the time, there will be an object pointed by a zero-sized symbol and even if we are guessing incorrectly, there will be no harm in creating references of such form. The problem happens when there's no object corresponding to the original symbol and the next object is an (unmarked) jump table: A: # <- zero-sized object .LJUMP_TABLE: .long <entry1> .long <entry2> .... .LB: .long 21 .LC: .long 42 The jump table will be moved and all references past it (up to the next named object) will be incorrectly updated. We should not speculate about the size of A in a case like that and treat all discovered data objects (and thus references) independently. (cherry picked from FBD27005660)	2021-03-15 12:06:56 -07:00
Alexander Yermolovich	06959eedcf	Fix up test for Update DW_AT_stmt_list for .debug_types Summary: As titled. (cherry picked from FBD28112186)	2021-03-17 17:08:26 -07:00
Alexander Yermolovich	0ec91a25df	Update DW_AT_stmt_list for .debug_types Summary: There is no real link between CU and TU, so we rely on the fact that the addresses are the same and update all of them. (cherry picked from FBD28112114)	2021-02-17 15:30:10 -08:00
Amir Ayupov	1c5d3a056c	Rebase: Merge BOLT codebase in monorepo Summary: This commit is the first step in rebasing all of BOLT history in the LLVM monorepo. It also solves trivial build issues by updating BOLT codebase to use current LLVM. There is still work left in rebasing some BOLT features and in making sure everything is working as intended. History has been rewritten to put BOLT in the /bolt folder, as opposed to /tools/llvm-bolt. (cherry picked from FBD33289252)	2020-12-01 16:29:39 -08:00
Rafael Auler	e0261a22ce	[TEST] Remove dependency on debug output Summary: Test mistakenly used -debug output, which makes it fail on no-asserts build. (cherry picked from FBD25399449)	2020-12-09 12:25:58 -08:00
Rafael Auler	d2f68039bc	[BOLT] Fix shrinkwrapping bug when changing frame alignment Summary: This fixes a bug with shrink wrapping when trying to move push-pops in a function where we are not allowed to modify the stack layout for alignment reasons. In this bug, we failed to propagate alignment requirement upwards in the call graph from function A to B when: (1) there is a cycle in the call graph and (2) the distance from A to B is greater than 1 in the call graph and (3) there is a node in the path from A to B, not including A or B, that does not access parameters in the stack. (cherry picked from FBD25315977)	2020-12-03 20:09:32 -08:00
Amir Ayupov	f9d00d418b	[BOLT] Handle insertion of updated CFI at the first basic block Summary: Fix corner case of insertion of updated CFI with unset `PrevBB`. Handle it in the same way as inserting past hot-cold split point. (cherry picked from FBD24943911)	2020-11-17 18:40:19 -08:00
Amir Ayupov	6401af89c7	[BOLT] Support jump tables in split fragments with entries pointing back to parent functions Summary: Support jump tables belonging to split fragments with entries pointing back to parent functions. While skipping such families of functions, make sure to use the topmost fragment to ignore its fragments. (cherry picked from FBD24907438)	2020-11-12 11:54:51 -08:00
Amir Ayupov	c0cb550536	Minimize X86/shrinkwrapping-critedge test case Summary: Minimized test case while preserving the CFG subgraph with an issue (cherry picked from FBD24871063)	2020-11-10 21:22:57 -08:00
Amir Ayupov	e54d389799	[BOLT] Disable DynoStats printing after SCTC Summary: Introduce new BinaryFunction flag `IsCanonicalCFG`, which gets unset by SCTC pass. Make DynoStats collection conditional on this new flag. SCTC leaves CFG in a state where branch counters of BBs with tail calls/conditional tail calls are not available (except via annotations, which get stripped by `lower-annotations`). Without branch counters, DynoStats are invalid. (cherry picked from FBD24558050)	2020-11-10 10:51:23 -08:00
Amir Ayupov	2b09d672ce	Conservatively handle jump tables in split functions Summary: - Allow jump table entries to point to locations inside the function and its fragments. Reasoning behind this is that jump table identification has the logic of stopping at entry which belongs to a function different from the one originally referencing jump table. This assumption is invalid for jump tables with entries pointing to both parent function and cold fragments, leading to "unclaimed PC-relative relocations" assertion. - Add fragment identification heuristic based on function name regex and contiguous jump table entries. Currently, parent-to-fragment relationship is set up based on interprocedural references – direct references from the parent function. These references don't include references through jump table. Additionally, some fragments are only reachable through jump table. In that case, in order to fully consume jump table, add parent-to-fragment relationship during `analyzeJumpTable` using the following heuristics: 1. Fragment is identified as such based on name (contains `.cold.` part), but 2. Parent function is not set – no direct interprocedural references to that fragment, and 3. Fragment has the name of the form <parent>.cold(.\d+) * For split functions with jump table entries spanning parent and fragments, mark parent and all fragments as ignored. (cherry picked from FBD24456904)	2020-11-06 11:19:03 -08:00
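The name-based part of the fragment heuristic in 2b09d672ce can be sketched as a regular expression. This is a minimal illustration, not BOLT's implementation: the function name `parent_name` is hypothetical, and treating the numeric suffix as optional is an assumption, since the commit message only gives the form `<parent>.cold(.\d+)`.

```python
import re

# Assumed pattern: "<parent>.cold" with an optional ".<digits>" suffix.
FRAGMENT_RE = re.compile(r"^(?P<parent>.+)\.cold(?:\.\d+)?$")


def parent_name(symbol):
    """Return the presumed parent function name, or None if the symbol
    does not look like a cold fragment."""
    match = FRAGMENT_RE.match(symbol)
    return match.group("parent") if match else None
```

A name match alone only suggests a parent; per the commit message, the relationship is additionally tied to contiguous jump table entries and the absence of direct interprocedural references.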
Amir Ayupov	dc48354f71	processInterproceduralReferences: record references to cold fragments as entry points Summary: For interprocedural references to fragments, record them as fragment entry points. Not registering these entry points leads to UCE removing the blocks and "Undefined temporary symbol" assertion. (cherry picked from FBD24511281)	2020-11-06 10:57:47 -08:00
Rafael Auler	e4396c41da	[BOLT] Ignore __hot_start, __hot_end from input Summary: When -hot-text is on, do not read __hot_start and __hot_end from input (inserted by a linker script with the intent of ordering functions). This can confuse BOLT into creating a function with this name depending on which address the symbol lands and we will assert when trying to emit our own __hot_start/__hot_end with symbol redefinition. (cherry picked from FBD24366636)	2020-10-17 00:50:27 -07:00
Rafael Auler	0b6df06e04	[BOLT] In shrinkwrap, do not split prefix/instr Summary: When placing restore instructions in the shrink wrapping pass, we typically put them right before the last instruction of a block at the dominance frontier. If this instruction happened to have a prefix, because the MC lib separates prefix into separate MCInsts, we would accidentally put a load between a prefix and another instruction. Fix this. (cherry picked from FBD24295324)	2020-10-14 12:40:33 -07:00
Rafael Auler	d7fb998637	[BOLT] Fix sign issue when validating X86 relocations Summary: In analyzeRelocations, we extract the result of the relocation from binary code to recreate the target of it in a few special cases. For R_X86_64_32S relocations, however, we were neglecting the possibility of the encoded value in the instruction to be negative. (cherry picked from FBD24096347)	2020-10-05 12:41:03 -07:00
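The sign issue in d7fb998637 can be illustrated with a small sketch. R_X86_64_32S stores a 32-bit value that is sign-extended to 64 bits, so the four encoded bytes must be read as a signed integer. The helper names below are hypothetical and this is not the code from `analyzeRelocations`.

```python
import struct


def read_imm32_unsigned(code, offset):
    """Read 4 little-endian bytes as unsigned (wrong for R_X86_64_32S)."""
    return struct.unpack_from("<I", code, offset)[0]


def read_imm32_signed(code, offset):
    """Read 4 little-endian bytes as signed, matching sign extension."""
    return struct.unpack_from("<i", code, offset)[0]


# An encoded value of -16 occupies the bytes f0 ff ff ff.
encoded = struct.pack("<i", -16)
unsigned_value = read_imm32_unsigned(encoded, 0)
signed_value = read_imm32_signed(encoded, 0)
```

Reading the same bytes as unsigned yields a large positive number, so any target recomputed from it would be off by 2^32.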
Amir Ayupov	8c4ba8f165	Bugfix for splitting critical edges in shrink wrapping Summary: Fix issue with splitting critical edges originating at the same BB in ShrinkWrapping::splitFrontierCritEdges. Splitting of critical edges originating at the same FromBB wasn't handled correctly as the Frontier at index corresponding to FromBB was overwritten with basic blocks created for multiple DestinationBBs. (cherry picked from FBD23232398)	2020-08-20 19:00:29 -07:00
Rafael Auler	c6799a689d	[BOLT] Fix stack alignment for runtime lib Summary: Right now, the SAVE_ALL sequence executed upon entry of both of our runtime libs (hugify and instrumentation) will cause the stack to not be aligned at a 16B boundary because it saves 15 8-byte regs. Change the code sequence to adjust for that. The compiler may generate code that assumes the stack is aligned by using movaps instructions, which will crash. (cherry picked from FBD22744307)	2020-07-27 16:52:51 -07:00
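The arithmetic behind c6799a689d is short enough to check directly. The sketch below assumes the stack pointer is 16-byte aligned before the register saves, which the commit message does not state; `padding_for_alignment` is a hypothetical helper, not part of the runtime library.

```python
def padding_for_alignment(bytes_pushed, alignment=16):
    """Extra bytes needed so the total pushed size is a multiple of
    `alignment`."""
    return -bytes_pushed % alignment


# SAVE_ALL saves 15 registers of 8 bytes each.
saved_bytes = 15 * 8
padding = padding_for_alignment(saved_bytes)
```

With 120 bytes saved, 8 more bytes of adjustment restore the 16-byte boundary that instructions such as movaps rely on.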
Rafael Auler	ed02946281	[BOLT] Fix hot_end symbol update with user function order Summary: If no profile data is provided, but only a user-provided order file for functions, fix the placement of the __hot_end symbol. (cherry picked from FBD22713265)	2020-07-24 10:28:36 -07:00
Rafael Auler	170f73ac9e	[BOLT] Fix fix-branches in presence of JRCXZ and friends Summary: Do not fail/assert when trying to reorder blocks that terminate with JRCXZ/JECXZ/LOOP instructions. We cannot invert the condition of these instructions, so just treat them accordingly in fixBranches(). (cherry picked from FBD22487107)	2020-07-15 23:02:58 -07:00
Rafael Auler	26ad0bd951	[TESTS] Re-add issue20/issue26 tests Summary: Re-add tests removed because they used to depend on yaml2obj. Rewrite them with an assembler (llvm-mc) and use the system linker to produce a valid ELF as input to BOLT. (cherry picked from FBD22323449)	2020-06-30 18:36:49 -07:00
Rafael Auler	41cb6b68ed	Update X86/pre-aggregated-perf.test Summary: Add REQUIRED statement. (cherry picked from FBD22290759)	2020-06-24 18:24:07 -07:00
Maksim Panchenko	5296b6d12a	[BOLT] Change symbol handling for secondary function entries Summary: Some functions could be called at an address inside their function body. Typically, these functions are written in assembly as C/C++ does not have a multi-entry function concept. The addresses inside a function body that could be referenced from outside are called secondary entry points. In BOLT we support processing functions with secondary/multiple entry points. We used to mark basic blocks representing those entry points with a special flag. There was only one problem - each basic block has exactly one MCSymbol associated with it, and for the most efficient processing we prefer that symbol to be local/temporary. However, in certain scenarios, e.g. when running in non-relocation mode, we need the entry symbol to be global/non-temporary. We could create global symbols for secondary points ahead of time when the entry point is marked in the symbol table. But not all such entries are properly marked. This means that potentially we could discover an entry point only after disassembling the code that references it, and it could happen after a local label was already created at the same location together with all its references. Replacing the local symbol and updating the references turned out to be an error-prone process. This diff takes a different approach. All basic blocks are created with permanently local symbols. Whenever there's a need to add a secondary entry point, we create an extra global symbol or use an existing one at that location. Containing BinaryFunction maps a local symbol of a basic block to the global symbol representing a secondary entry point. This way we can tell if the basic block is a secondary entry point, and we emit both symbols for all secondary entry points. Since secondary entry points are quite rare, the overhead of this approach is minimal. 
Note that the same location could be referenced via local symbol from inside a function and via global entry point symbol from outside. This is true for both primary and secondary entry points. (cherry picked from FBD21150193)	2020-04-19 22:29:54 -07:00
Rafael Auler	6dbd15bc01	[BOLT-X86] Fix instrumentation issue with indirect calls Summary: Indirect calls that use RSP to compute the target address would break in instrumentation mode because we were adding instructions that changed the stack pointer. Fix this. (cherry picked from FBD20883791)	2020-04-06 17:38:11 -07:00
Rafael Auler	340da8f294	[BOLT] Fix shrink wrapping to check pops Summary: Shrink wrapping has a mode where it will directly move push pop pairs, instead of replacing them with stores/loads. This is an ambitious mode that is triggered sometimes, but whenever matching with a push, it would operate with the assumption that the restoring instruction was a pop, not a load, otherwise it would assert. Fix this assertion to bail nicely back to non-pushpop mode (use regular store and load instructions). (cherry picked from FBD20085905)	2020-02-18 16:00:40 -08:00
Xin-Xin Wang	d87f95065a	[BOLT] Add missing CMake test dependencies Summary: I noticed when setting up a new repository for bolt that bolt tests would fail unexpectedly when running `ninja check-bolt` and `ninja check-llvm`. This turns out to be because dependencies for bolt binaries were not specified in the CMake configuration so they were not built before running the tests. This diff adds the dependencies to the CMake configuration for check-bolt and check-llvm so that bolt binaries are built before running tests. (cherry picked from FBD17919505)	2019-10-14 16:03:54 -07:00
Rafael Auler	ddfcf4f266	[BOLT] Add parser for pre-aggregated perf data Summary: The regular perf2bolt aggregation job is to read perf output directly. However, if the data is coming from a database instead of perf, one could write a query to produce a pre-aggregated file. This function deals with this case. The pre-aggregated file contains aggregated LBR data, but without binary knowledge. BOLT will parse it and, using information from the disassembled binary, augment it with fall-through edge frequency information. After this step is finished, this data can be either written to disk to be consumed by BOLT later, or can be used by BOLT immediately if kept in memory. File format syntax: {B|F|f} [<start_id>:]<start_offset> [<end_id>:]<end_offset> <count> [<mispred_count>] B - indicates an aggregated branch F - an aggregated fall-through (trace) f - an aggregated fall-through with external origin - used to disambiguate between a return hitting a basic block head and a regular internal jump to the block <start_id> - build id of the object containing the start address. We can skip it for the main binary and use "X" for an unknown object. This will save some space and facilitate human parsing. <start_offset> - hex offset from the object base load address (0 for the main executable unless it's PIE) to the start address. <end_id>, <end_offset> - same for the end address. <count> - total aggregated count of the branch or a fall-through. <mispred_count> - the number of times the branch was mispredicted. Omitted for fall-throughs. Example F 41be50 41be50 3 F 41be90 41be90 4 f 41be90 41be90 7 B 4b1942 39b57f0 3 0 B 4b196f 4b19e0 2 0 (cherry picked from FBD8887182)	2018-07-17 18:31:46 -07:00
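The pre-aggregated file format described in ddfcf4f266 can be read with a few lines of code. The following is a minimal sketch, not BOLT's parser: the `Location` and `Record` types and the function names are hypothetical, and treating the counts as decimal is an assumption (the commit message only states that offsets are hex).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Location:
    build_id: Optional[str]  # None: main binary; "X": unknown object
    offset: int              # hex offset from the object's load address


@dataclass
class Record:
    kind: str                # "B", "F", or "f"
    start: Location
    end: Location
    count: int
    mispreds: Optional[int]  # None when the field is omitted


def parse_location(text):
    """Parse "[<id>:]<offset>" where the offset is hexadecimal."""
    build_id, sep, offset = text.rpartition(":")
    return Location(build_id if sep else None, int(offset, 16))


def parse_line(line):
    """Parse one "{B|F|f} <start> <end> <count> [<mispred_count>]" line."""
    fields = line.split()
    if len(fields) not in (4, 5) or fields[0] not in ("B", "F", "f"):
        raise ValueError(f"malformed pre-aggregated line: {line!r}")
    kind, start, end, count = fields[:4]
    # Assumption: counts are written in decimal.
    mispreds = int(fields[4]) if len(fields) == 5 else None
    return Record(kind, parse_location(start), parse_location(end),
                  int(count), mispreds)
```

For example, `parse_line("B 4b1942 39b57f0 3 0")` yields a branch record with start offset 0x4b1942, count 3, and a misprediction count of 0, while a fall-through line such as `F 41be50 41be50 3` leaves `mispreds` unset.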
Rafael Auler	12380b8b06	Fix assembly after adding entry points Summary: When a given function B, located after function A, references one of A's basic blocks, it registers a new global symbol at the reference address and updates A's Labels vector via BinaryFunction::addEntryPoint(). However, we don't update A's branch targets at this point. So we end up with an inconsistent CFG, where the basic block names are global symbols, but the internal branch operands are still referencing the old local name of the corresponding blocks that got promoted to an entry point. This patch fixes this by detecting this situation in addEntryPoint and iterating over all instructions, looking for references to the old symbol and replacing them with the new global symbol (since this is now an entry point). Fixes facebookincubator/BOLT#26 (cherry picked from FBD8728407)	2018-07-03 11:57:46 -07:00
Rafael Auler	544d1577c1	Avoid removing BBs referenced by JTs Summary: While removing unreachable blocks, we may decide to remove a block that is listed as a target in a jump table entry. If we do that, this label will be then undefined and LLVM assembler will crash. Mitigate this for now by not removing such blocks, as we don't support removing unnecessary jump tables yet. Fixes facebookincubator/BOLT#20 (cherry picked from FBD8730269)	2018-07-03 17:02:33 -07:00
Rafael Auler	8f717dd25e	[BOLT] Add initial bolt-only test infra Summary: Create folders and setup to make LIT run BOLT-only tests. Add a test example. This will add a new make/ninja rule "check-bolt" that the user can invoke to run LIT on this folder. (cherry picked from FBD8595786)	2018-06-22 13:50:07 -07:00

216 Commits