llvm-project

Author	SHA1	Message	Date
Vladislav Khmelevsky	fd9604952d	[BOLT] Set valid index for functions with profiles Some of the passes that calculate a tentative layout, like LongJmp and Golang, expect that only functions with a valid index will be located in the hot text section. But currently functions with valid profiles and no index set break this logic; to fix this we can move the hasValidProfile() condition from the AssignSections pass to ReorderFunctions. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D127223	2022-06-08 14:13:12 +03:00
Fangrui Song	b92436efcb	[bolt] Remove unneeded cl::ZeroOrMore for cl::opt options	2022-06-05 13:29:49 -07:00
Fangrui Song	36c7d79dc4	Remove unneeded cl::ZeroOrMore for cl::opt options Similar to `557efc9a8b`. This commit handles options where cl::ZeroOrMore is more than one line below cl::opt.	2022-06-04 00:10:42 -07:00
Huan Nguyen	5ac26156fe	[BOLT][NFC] Warning for deprecated option '-reorder-blocks=cache+' Emit warning when using deprecated option '-reorder-blocks=cache+'. Auto switch to option '-reorder-blocks=ext-tsp'. Test Plan: ``` ninja check-bolt ``` Added a new test cache+-deprecated.test. Run and verify that the upstream tests are passed. Reviewed By: rafauler, Amir, maksfb Differential Revision: https://reviews.llvm.org/D126722	2022-06-03 14:16:55 -07:00
spupyrev	5904836b8a	[BOLT] Cache-Aware Tail Duplication A new "cache-aware" strategy for tail duplication. Differential Revision: https://reviews.llvm.org/D123050	2022-06-03 09:08:45 -07:00
Amir Ayupov	e2142ff47c	[BOLT][NFC] Make ICP::verifyProfile static Follow LLVM style guide suggestion to avoid function definitions in anonymous namespaces: https://llvm.org/docs/CodingStandards.html#anonymous-namespaces Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124896	2022-06-02 19:09:29 -07:00
Balazs Benics	a73b50ad06	Revert "[llvm][clang][bolt][NFC] Use llvm::less_first() when applicable" This reverts commit `3988bd1398`. Did not build on this bot: https://lab.llvm.org/buildbot#builders/215/builds/6372 /usr/include/c++/9/bits/predefined_ops.h:177:11: error: no match for call to ‘(llvm::less_first) (std::pair<long unsigned int, llvm::bolt::BinaryBasicBlock>&, const std::pair<long unsigned int, std::nullptr_t>&)’ 177 \| { return bool(_M_comp(__it, __val)); }	2022-05-27 11:19:18 +02:00
Balazs Benics	3988bd1398	[llvm][clang][bolt][NFC] Use llvm::less_first() when applicable One could reuse this functor instead of rolling out your own version. There were a couple other cases where the code was similar, but not quite the same, such as it might have an assertion in the lambda or other constructs. Thus, I've not touched any of those, as it might change the behavior in some way. As per https://discourse.llvm.org/t/submitting-simple-nfc-patches/62640/3?u=steakhal Chris Lattner > LLVM intentionally has a “yes, you can apply common sense judgement to > things” policy when it comes to code review. If you are doing mechanical > patches (e.g. adopting less_first) that apply to the entire monorepo, > then you don’t need everyone in the monorepo to sign off on it. Having > some +1 validation from someone is useful, but you don’t need everyone > whose code you touch to weigh in. Differential Revision: https://reviews.llvm.org/D126068	2022-05-27 11:15:23 +02:00
Rafael Auler	c09cd64e5c	[BOLT] Fix AND evaluation bug in shrink wrapping Fix a bug where shrink-wrapping would use wrong stack offsets because the stack was being aligned with an AND instruction, hence, making its true offsets only available during runtime (we can't statically determine where are the stack elements and we must give up on this case). Reviewed By: Amir Differential Revision: https://reviews.llvm.org/D126110	2022-05-26 14:59:28 -07:00
Amir Ayupov	bdba3d091c	[BOLT][CMAKE] Fix DYLIB build Move BOLT libraries out of `LLVM_LINK_COMPONENTS` to `target_link_libraries`. Addresses issue #55432. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D125568	2022-05-13 13:27:21 -07:00
Amir Ayupov	139744ac53	[BOLT][NFC] Suppress unused variable warnings Address warnings in Release build without assertions. Tip @tschuett for reporting the issue #55404. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D125475	2022-05-13 20:10:19 +01:00
Amir Ayupov	d63c5a38fe	[BOLT][NFC] Use BitVector::set_bits Refactor and use `set_bits` BitVector interface. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D125374	2022-05-11 16:23:44 -07:00
Amir Ayupov	8cb7a873ab	[BOLT][NFC] Add MCPlus::primeOperands iterator_range Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D125397	2022-05-11 09:34:51 -07:00
Amir Ayupov	c2d40f1dfb	[BOLT] Add icp-inline option Add an option to only peel ICP targets that can be subsequently inlined. Yet there's no guarantee that they will be inlined. The mode is independent from the heuristic used to choose ICP targets: by exec count, mispredictions, or memory profile. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124900	2022-05-11 03:21:24 -07:00
Amir Ayupov	f8d2d8b587	[BOLT][NFC] Move getInliningInfo out of Inliner class `getInliningInfo` is useful in other passes that need to check inlining eligibility for some function. Move the declaration and InliningInfo definition out of Inliner class. Prepare for subsequent use in ICP. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124899	2022-05-04 14:08:06 -07:00
Amir Ayupov	2ad1c7540e	[BOLT][NFC] Minor cleanup in ICP getCallTargets and canPromoteCallsite Minor refactoring. NFC. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124898	2022-05-04 14:06:53 -07:00
Amir Ayupov	60957a5a08	[BOLT] Fix ICPJumpTablesTopN option use Fix non-sensical `opts::ICPJumpTablesTopN != 0 ? opts::ICPTopN : opts::ICPTopN`. Refactor/simplify another similar assignment. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124880	2022-05-03 19:34:10 -07:00
Amir Ayupov	c3d5372093	[BOLT][NFC] Make ICP options naming uniform Rename `opts::IndirectCallPromotion` to `opts::ICP`, making option naming uniform and easier to follow. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124879	2022-05-03 19:32:45 -07:00
Amir Ayupov	d0b1c98c96	[BOLT][NFC] ICP: simplify findTargetsIndex Unnest lambda and use `llvm::is_contained`. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124877	2022-05-03 19:31:20 -07:00
Amir Ayupov	ec02227bf7	[BOLT][NFC] Refactor ICP::findCallTargetSymbols Reduce nesting making it easier to read. Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D124876	2022-05-03 19:29:22 -07:00
Paul Kirth	a0b8ab1ba3	[BOLT][NFC] Fix warning for unqualified call to std::move Fixes warning from RetpolineInsertion.cpp:171:44: warning: unqualified call to std::move [-Wunqualified-std-cast-call] Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D124482	2022-04-26 23:18:20 +00:00
Amir Ayupov	bad3798113	[BOLT] Fix data race in shortenInstructions Address ThreadSanitizer warning Reviewed By: maksfb Differential Revision: https://reviews.llvm.org/D121338	2022-04-13 11:10:36 -07:00
Vladislav Khmelevsky	4c14519ecb	[BOLT] LongJmp: Check for shouldEmit Check that the function will be emitted in the final binary. Preserving the old function address is needed in case it is a PLT trampoline, which is currently not moved by BOLT. Differential Revision: https://reviews.llvm.org/D122098	2022-03-31 22:33:09 +03:00
Vladislav Khmelevsky	fed958c6cc	[BOLT] AArch64: Emit text objects BOLT treats aarch64 objects located in text as empty functions with constant islands. Emit them with at least 8-byte alignment to the new text section. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D122097	2022-03-31 22:28:50 +03:00
Vladislav Khmelevsky	af9bdcfc46	[BOLT] Align constant islands to 8 bytes AArch64 requires CI to be aligned to 8 bytes due to access instructions restrictions. E.g. the ldr with imm, where imm must be aligned to 8 bytes. Differential Revision: https://reviews.llvm.org/D122065	2022-03-27 22:30:42 +03:00
spupyrev	4609f60ebc	[BOLT] Avoid pointless loop rotation It seems the earlier implementation does not follow the description in LoopRotationPass.h: It rotates loops even if they are already laid out correctly. The diff adjusts the behaviour. Given that the impact of LoopInversionPass is minor, this change won't yield significant perf differences. Tested on clang-10: there seems to be a 0.1%-0.3% cpu win and a small reduction of branch misses. Before: BOLT-INFO: 120 Functions were reordered by LoopInversionPass After: BOLT-INFO: 79 Functions were reordered by LoopInversionPass Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D121921	2022-03-22 12:42:42 -07:00
Vladislav Khmelevsky	5be5d0f56e	[BOLT] LongJmp speedup refactoring Run tentativeLayoutRelocMode twice only if the UseOldText option was passed. Refactor the BF loop to break once the condition is met. Differential Revision: https://reviews.llvm.org/D121825	2022-03-18 16:16:47 +03:00
Amir Ayupov	dc1cf838a5	[BOLT] Strip redundant AdSize override prefix Since LLVM MC now preserves redundant AdSize override prefix (0x67), remove it in BOLT explicitly (-x86-strip-redundant-adsize, on by default). Test Plan: `bin/llvm-lit -a bolt/test/X86/addr32.s` Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D120975	2022-03-16 09:38:17 -07:00
Vladislav Khmelevsky	62a289d85c	[BOLT] LongJmp: Fix hot text section alignment The BinaryEmitter uses opts::AlignText value to align the hot text section. Also check that the opts::AlignText is at least equal opts::AlignFunctions for the same reason, as described in D121392. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D121728	2022-03-16 15:57:46 +03:00
Vladislav Khmelevsky	8ab69baad5	[BOLT] Set cold sections alignment explicitly The cold text section alignment is set using the maximum alignment value passed to emitCodeAlignment. To calculate the tentative layout correctly, we set the minimum alignment of such sections to the maximum possible function alignment explicitly. Differential Revision: https://reviews.llvm.org/D121392	2022-03-15 22:12:17 +03:00
Vladislav Khmelevsky	04b87cf0e7	[BOLT] LongJmp: Use per-function alignment values The per-function alignment values must be used in order to create tentative layout. Differential Revision: https://reviews.llvm.org/D121298	2022-03-10 19:48:48 +03:00
Amir Ayupov	687e4af1c0	[BOLT] CMOVConversion pass Convert simple hammocks into cmov based on misprediction rate. Test Plan: - Assembly test: `cmov-conversion.s` - Testing on a binary: # Bootstrap clang with `-x86-cmov-converter-force-all` and `-Wl,--emit-relocs` (Release build) # Collect perf.data: - `clang++ <opts> bolt/lib/Core/BinaryFunction.cpp -E > bf.cpp` - `perf record -e cycles:u -j any,u -- clang-15 bf.cpp -O2 -std=c++14 -c -o bf.o` # Optimize clang-15 with and w/o -cmov-conversion: - `llvm-bolt clang-15 -p perf.data -o clang-15.bolt` - `llvm-bolt clang-15 -p perf.data -cmov-conversion -o clang-15.bolt.cmovconv` # Run perf experiment: - test: `clang-15.bolt.cmovconv`, - control: `clang-15.bolt`, - workload (clang options): `bf.cpp -O2 -std=c++14 -c -o bf.o` Results: ``` task-clock [delta: -360.21 ± 356.75, delta(%): -1.7760 ± 1.7589, p-value: 0.047951, balance: -6] instructions [delta: 44061118 ± 13246382, delta(%): 0.0690 ± 0.0207, p-value: 0.000001, balance: 50] icache-misses [delta: -5534468 ± 2779620, delta(%): -0.4331 ± 0.2175, p-value: 0.028014, balance: -28] branch-misses [delta: -1624270 ± 1113244, delta(%): -0.3456 ± 0.2368, p-value: 0.030300, balance: -22] ``` Reviewed By: rafauler Differential Revision: https://reviews.llvm.org/D120177	2022-03-08 10:44:31 -08:00
Amir Ayupov	36ada32727	[BOLT][NFC] Fix data race in ShrinkWrapping stats Fix data race reported by ThreadSanitizer in clang.test: ``` ThreadSanitizer: data race /data/llvm-project/bolt/lib/Passes/ShrinkWrapping.cpp:1359:28 in llvm::bolt::ShrinkWrapping::moveSaveRestores() ``` The issue is with incrementing global counters from multiple threads. Reviewed By: yota9 Differential Revision: https://reviews.llvm.org/D120218	2022-02-20 17:21:58 -08:00
serge-sans-paille	57f7c7d90e	Add missing MC includes in bolt/ Changes needed after `ef736a1c39` that removes some implicit dependencies from MC headers.	2022-02-09 08:28:34 -05:00
Vladislav Khmelevsky	19fb5a210d	[BOLT] Add aarch64 support for peephole passes Enable peephole optimizations for aarch64. Also small code refactoring - add PeepholeOpts under Peepholes class. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei Differential Revision: https://reviews.llvm.org/D118732	2022-02-08 03:04:40 +03:00
Amir Ayupov	194b164eb5	[BOLT][NFC] Fix compiler warnings Summary: - variable 'TotalSize' set but not used - variable 'TotalCallsTopN' set but not used - use of bitwise '\|' with boolean operands Reviewed By: maksfb FBD33911129	2022-02-04 15:57:33 -08:00
Amir Ayupov	f8c7fb499b	[BOLT][NFC] Reduce includes with include-what-you-use Summary: Removed redundant includes with IWYU Test Plan: ninja bolt Reviewers: maksfb FBD32043568	2022-01-21 12:05:47 -08:00
Amir Ayupov	5a654b0113	[BOLT] Make ICP target selection (more) deterministic Summary: Break ties by selecting targets with lower addresses. Reviewers: maksfb FBD33677001	2022-01-21 12:03:43 -08:00
Amir Ayupov	f18fcdabda	[BOLT][NFC] Expand auto types pt.2 Summary: Expand autos where it may lead to differences in the BOLT binary. Test Plan: NFC Reviewers: maksfb Reviewed By: maks FBD27673231	2022-01-21 12:02:57 -08:00
Amir Ayupov	a9cd49d50e	[BOLT][NFC] Move Offset annotation to Group 1 Summary: Move the annotation to avoid dynamic memory allocations. Improves the CPU time of instrumenting a large binary by 1% (+-0.8%, p-value 0.01) Test Plan: NFC Reviewers: maksfb FBD30091656	2022-01-18 13:24:50 -08:00
Amir Ayupov	18bc405a09	[BOLT][NFC] Remove uses of `std::vector<bool>` Summary: LLVM Programmer’s Manual strongly discourages the use of `std::vector<bool>` and suggests `llvm::BitVector` as a possible replacement.	2022-01-13 22:46:34 -08:00
Amir Ayupov	b1a107db56	[BOLT][NFC] Format braced initializer lists Summary: Use assignment (`=`) with braced initializer lists when constructing aggregate temporaries in expressions. https://llvm.org/docs/CodingStandards.html#braced-initializer-lists (cherry picked from FBD33515669)	2022-01-10 12:45:55 -08:00
Amir Ayupov	3b01fbebeb	[BOLT] Fix debug logging in IndirectCallPromotion Summary: Access elements of a value pair in HotTargetMap debug logging/loop over HotTargetMap key-value. (cherry picked from FBD33344656)	2021-12-28 16:37:53 -08:00
Amir Ayupov	f92ab6af35	[BOLT][NFC] Fix braces usage in Passes Summary: Refactor bolt/*/Passes to follow the braces rule for if/else/loop from [LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html). (cherry picked from FBD33344642)	2021-12-28 16:36:17 -08:00
Vladislav Khmelevsky	2d84e344d9	[PR][BOLT] Check for end iterator in LongJmp stub lookup Summary: The lower_bound might return the end iterator, the ignoring of which will cause memory corruption. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei (cherry picked from FBD33307803)	2021-11-28 02:56:30 +03:00
Maksim Panchenko	e1eeef5b90	[BOLT][RFC] Use new LLVM license for ADRRelaxationPass Summary: Fixes facebookincubator/BOLT#271 (cherry picked from FBD33299273)	2021-12-23 10:49:37 -08:00
Maksim Panchenko	2f09f445b2	[BOLT][NFC] Fix file-description comments Summary: Fix comments at the start of source files. (cherry picked from FBD33274597)	2021-12-21 10:21:41 -08:00
Vladislav Khmelevsky	08f56926c2	[BOLT] Move disassemble optimizations to optimization passes Summary: The patch moves the shortenInstructions and nop remove to separate binary passes. As a result when llvm-bolt optimizations stage will begin the instructions of the binary functions will be absolutely the same as it was in the binary. This is needed for the golang support by llvm-bolt. Some of the tests must be changed, since bb alignment nops might create unreachable BBs in original functions. Vladislav Khmelevsky, Advanced Software Technology Lab, Huawei (cherry picked from FBD32896517)	2021-12-18 17:03:35 -08:00
Maksim Panchenko	40c2e0fafe	[BOLT][NFC] Reformat with clang-format Summary: Selectively apply clang-format to BOLT code base. (cherry picked from FBD33119052)	2021-12-14 16:52:51 -08:00
Maksim Panchenko	69706eafab	[BOLT] Refactor BinaryBasicBlock to use ADT Summary: Refactor members of BinaryBasicBlock. Replace some std containers with ADT equivalents. The size of BinaryBasicBlock on x86-64 Linux is reduced from 232 bytes to 192 bytes. (cherry picked from FBD33081850)	2021-12-09 11:53:12 -08:00
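
Commit `2d84e344d9` above notes that `lower_bound` might return the end iterator and that ignoring this causes memory corruption. A minimal sketch of that general pattern follows; it is not the actual LongJmp code. The `StubTable` type and the `findStubAtOrAbove` helper are hypothetical names introduced only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a table of (address, stub id) pairs,
// kept sorted by address. Not a BOLT data structure.
using Stub = std::pair<uint64_t, int>;
using StubTable = std::vector<Stub>;

// Returns a pointer to the first stub whose address is >= Addr,
// or nullptr when no such stub exists.
const Stub *findStubAtOrAbove(const StubTable &Stubs, uint64_t Addr) {
  // std::lower_bound requires the range to be sorted by the same key.
  assert(std::is_sorted(Stubs.begin(), Stubs.end(),
                        [](const Stub &A, const Stub &B) {
                          return A.first < B.first;
                        }));

  auto It = std::lower_bound(
      Stubs.begin(), Stubs.end(), Addr,
      [](const Stub &Entry, uint64_t Value) { return Entry.first < Value; });

  // When every element compares less than Addr, lower_bound returns end().
  // Dereferencing end() is undefined behavior, so check before using It.
  if (It == Stubs.end())
    return nullptr;
  return &*It;
}
```

With a table holding addresses 0x1000 and 0x2000, a lookup for 0x1800 yields the entry at 0x2000, while a lookup for 0x3000 yields `nullptr` instead of reading past the end of the vector.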

66 Commits