llvm-project


Author	SHA1	Message	Date
Alexander Yermolovich	06959eedcf	Fix up test for Update DW_AT_stmt_list for .debug_types Summary: As titled. (cherry picked from FBD28112186)	2021-03-17 17:08:26 -07:00
Alexander Yermolovich	0ec91a25df	Update DW_AT_stmt_list for .debug_types Summary: There is no real link between a CU and a TU, so we rely on the fact that the addresses are the same and update all of them. (cherry picked from FBD28112114)	2021-02-17 15:30:10 -08:00
Amir Ayupov	1c5d3a056c	Rebase: Merge BOLT codebase in monorepo Summary: This commit is the first step in rebasing all of BOLT history in the LLVM monorepo. It also solves trivial build issues by updating BOLT codebase to use current LLVM. There is still work left in rebasing some BOLT features and in making sure everything is working as intended. History has been rewritten to put BOLT in the /bolt folder, as opposed to /tools/llvm-bolt. (cherry picked from FBD33289252)	2020-12-01 16:29:39 -08:00
Rafael Auler	e0261a22ce	[TEST] Remove dependency on debug output Summary: Test mistakenly used -debug output, which makes it fail on no-asserts build. (cherry picked from FBD25399449)	2020-12-09 12:25:58 -08:00
Rafael Auler	d2f68039bc	[BOLT] Fix shrinkwrapping bug when changing frame alignment Summary: This fixes a bug with shrink wrapping when trying to move push-pops in a function where we are not allowed to modify the stack layout for alignment reasons. In this bug, we failed to propagate alignment requirement upwards in the call graph from function A to B when: (1) there is a cycle in the call graph and (2) the distance from A to B is greater than 1 in the call graph and (3) there is a node in the path from A to B, not including A or B, that does not access parameters in the stack. (cherry picked from FBD25315977)	2020-12-03 20:09:32 -08:00
Amir Ayupov	f9d00d418b	[BOLT] Handle insertion of updated CFI at the first basic block Summary: Fix corner case of insertion of updated CFI with unset `PrevBB`. Handle it in the same way as inserting past hot-cold split point. (cherry picked from FBD24943911)	2020-11-17 18:40:19 -08:00
Amir Ayupov	6401af89c7	[BOLT] Support jump tables in split fragments with entries pointing back to parent functions Summary: Support jump tables belonging to split fragments with entries pointing back to parent functions. While skipping such families of functions, make sure to use the topmost fragment to ignore its fragments. (cherry picked from FBD24907438)	2020-11-12 11:54:51 -08:00
Amir Ayupov	c0cb550536	Minimize X86/shrinkwrapping-critedge test case Summary: Minimized test case while preserving the CFG subgraph with an issue (cherry picked from FBD24871063)	2020-11-10 21:22:57 -08:00
Amir Ayupov	e54d389799	[BOLT] Disable DynoStats printing after SCTC Summary: Introduce new BinaryFunction flag `IsCanonicalCFG`, which gets unset by SCTC pass. Make DynoStats collection conditional on this new flag. SCTC leaves CFG in a state where branch counters of BBs with tail calls/conditional tail calls are not available (except via annotations, which get stripped by `lower-annotations`). Without branch counters, DynoStats are invalid. (cherry picked from FBD24558050)	2020-11-10 10:51:23 -08:00
Amir Ayupov	2b09d672ce	Conservatively handle jump tables in split functions Summary: - Allow jump table entries to point to locations inside the function and its fragments. Reasoning behind this is that jump table identification has the logic of stopping at entry which belongs to a function different from the one originally referencing jump table. This assumption is invalid for jump tables with entries pointing to both parent function and cold fragments, leading to "unclaimed PC-relative relocations" assertion. - Add fragment identification heuristic based on function name regex and contiguous jump table entries. Currently, parent-to-fragment relationship is set up based on interprocedural references – direct references from the parent function. These references don't include references through jump table. Additionally, some fragments are only reachable through jump table. In that case, in order to fully consume jump table, add parent-to-fragment relationship during `analyzeJumpTable` using the following heuristics: 1. Fragment is identified as such based on name (contains `.cold.` part), but 2. Parent function is not set – no direct interprocedural references to that fragment, and 3. Fragment has the name of the form <parent>.cold(.\d+) * For split functions with jump table entries spanning parent and fragments, mark parent and all fragments as ignored. (cherry picked from FBD24456904)	2020-11-06 11:19:03 -08:00
Amir Ayupov	dc48354f71	processInterproceduralReferences: record references to cold fragments as entry points Summary: For interprocedural references to fragments, record them as fragment entry points. Not registering these entry points leads to UCE removing the blocks and "Undefined temporary symbol" assertion. (cherry picked from FBD24511281)	2020-11-06 10:57:47 -08:00
Rafael Auler	e4396c41da	[BOLT] Ignore __hot_start, __hot_end from input Summary: When -hot-text is on, do not read __hot_start and __hot_end from input (inserted by a linker script with the intent of ordering functions). This can confuse BOLT into creating a function with this name depending on which address the symbol lands and we will assert when trying to emit our own __hot_start/__hot_end with symbol redefinition. (cherry picked from FBD24366636)	2020-10-17 00:50:27 -07:00
Rafael Auler	0b6df06e04	[BOLT] In shrinkwrap, do not split prefix/instr Summary: When placing restore instructions in the shrink wrapping pass, we typically put them right before the last instruction of a block at the dominance frontier. If this instruction happened to have a prefix, because the MC lib separates prefix into separate MCInsts, we would accidentally put a load between a prefix and another instruction. Fix this. (cherry picked from FBD24295324)	2020-10-14 12:40:33 -07:00
Rafael Auler	d7fb998637	[BOLT] Fix sign issue when validating X86 relocations Summary: In analyzeRelocations, we extract the result of the relocation from binary code to recreate the target of it in a few special cases. For R_X86_64_32S relocations, however, we were neglecting the possibility of the encoded value in the instruction to be negative. (cherry picked from FBD24096347)	2020-10-05 12:41:03 -07:00
Amir Ayupov	8c4ba8f165	Bugfix for splitting critical edges in shrink wrapping Summary: Fix issue with splitting critical edges originating at the same BB in ShrinkWrapping::splitFrontierCritEdges. Splitting of critical edges originating at the same FromBB wasn't handled correctly as the Frontier at index corresponding to FromBB was overwritten with basic blocks created for multiple DestinationBBs. (cherry picked from FBD23232398)	2020-08-20 19:00:29 -07:00
Rafael Auler	c6799a689d	[BOLT] Fix stack alignment for runtime lib Summary: Right now, the SAVE_ALL sequence executed upon entry of both of our runtime libs (hugify and instrumentation) will cause the stack to not be aligned at a 16B boundary because it saves 15 8-byte regs. Change the code sequence to adjust for that. The compiler may generate code that assumes the stack is aligned by using movaps instructions, which will crash. (cherry picked from FBD22744307)	2020-07-27 16:52:51 -07:00
Rafael Auler	ed02946281	[BOLT] Fix hot_end symbol update with user function order Summary: If no profile data is provided, but only a user-provided order file for functions, fix the placement of the __hot_end symbol. (cherry picked from FBD22713265)	2020-07-24 10:28:36 -07:00
Rafael Auler	170f73ac9e	[BOLT] Fix fix-branches in presence of JRCXZ and friends Summary: Do not fail/assert when trying to reorder blocks that terminate with JRCXZ/JECXZ/LOOP instructions. We cannot invert the condition of these instructions, so just treat them accordingly in fixBranches(). (cherry picked from FBD22487107)	2020-07-15 23:02:58 -07:00
Rafael Auler	26ad0bd951	[TESTS] Re-add issue20/issue26 tests Summary: Re-add tests removed because they used to depend on yaml2obj. Rewrite them with an assembler (llvm-mc) and use the system linker to produce a valid ELF as input to BOLT. (cherry picked from FBD22323449)	2020-06-30 18:36:49 -07:00
Rafael Auler	41cb6b68ed	Update X86/pre-aggregated-perf.test Summary: Add REQUIRED statement. (cherry picked from FBD22290759)	2020-06-24 18:24:07 -07:00
Maksim Panchenko	5296b6d12a	[BOLT] Change symbol handling for secondary function entries Summary: Some functions could be called at an address inside their function body. Typically, these functions are written in assembly as C/C++ does not have a multi-entry function concept. The addresses inside a function body that could be referenced from outside are called secondary entry points. In BOLT we support processing functions with secondary/multiple entry points. We used to mark basic blocks representing those entry points with a special flag. There was only one problem - each basic block has exactly one MCSymbol associated with it, and for the most efficient processing we prefer that symbol to be local/temporary. However, in certain scenarios, e.g. when running in non-relocation mode, we need the entry symbol to be global/non-temporary. We could create global symbols for secondary points ahead of time when the entry point is marked in the symbol table. But not all such entries are properly marked. This means that potentially we could discover an entry point only after disassembling the code that references it, and it could happen after a local label was already created at the same location together with all its references. Replacing the local symbol and updating the references turned out to be an error-prone process. This diff takes a different approach. All basic blocks are created with permanently local symbols. Whenever there's a need to add a secondary entry point, we create an extra global symbol or use an existing one at that location. Containing BinaryFunction maps a local symbol of a basic block to the global symbol representing a secondary entry point. This way we can tell if the basic block is a secondary entry point, and we emit both symbols for all secondary entry points. Since secondary entry points are quite rare, the overhead of this approach is minimal. Note that the same location could be referenced via local symbol from inside a function and via global entry point symbol from outside. This is true for both primary and secondary entry points. (cherry picked from FBD21150193)	2020-04-19 22:29:54 -07:00
Rafael Auler	6dbd15bc01	[BOLT-X86] Fix instrumentation issue with indirect calls Summary: Indirect calls that use RSP to compute the target address would break in instrumentation mode because we were adding instructions that changed the stack pointer. Fix this. (cherry picked from FBD20883791)	2020-04-06 17:38:11 -07:00
Rafael Auler	340da8f294	[BOLT] Fix shrink wrapping to check pops Summary: Shrink wrapping has a mode where it will directly move push pop pairs, instead of replacing them with stores/loads. This is an ambitious mode that is triggered sometimes, but whenever matching with a push, it would operate with the assumption that the restoring instruction was a pop, not a load, otherwise it would assert. Fix this assertion to bail nicely back to non-pushpop mode (use regular store and load instructions). (cherry picked from FBD20085905)	2020-02-18 16:00:40 -08:00
Xin-Xin Wang	d87f95065a	[BOLT] Add missing CMake test dependencies Summary: I noticed when setting up a new repository for bolt that bolt tests would fail unexpectedly when running `ninja check-bolt` and `ninja check-llvm`. This turns out to be because dependencies for bolt binaries were not specified in the CMake configuration so they were not built before running the tests. This diff adds the dependencies to the CMake configuration for check-bolt and check-llvm so that bolt binaries are built before running tests. (cherry picked from FBD17919505)	2019-10-14 16:03:54 -07:00
Rafael Auler	ddfcf4f266	[BOLT] Add parser for pre-aggregated perf data Summary: The regular perf2bolt aggregation job is to read perf output directly. However, if the data is coming from a database instead of perf, one could write a query to produce a pre-aggregated file. This function deals with this case. The pre-aggregated file contains aggregated LBR data, but without binary knowledge. BOLT will parse it and, using information from the disassembled binary, augment it with fall-through edge frequency information. After this step is finished, this data can be either written to disk to be consumed by BOLT later, or can be used by BOLT immediately if kept in memory. File format syntax: {B|F|f} [<start_id>:]<start_offset> [<end_id>:]<end_offset> <count> [<mispred_count>] B - indicates an aggregated branch F - an aggregated fall-through (trace) f - an aggregated fall-through with external origin - used to disambiguate between a return hitting a basic block head and a regular internal jump to the block <start_id> - build id of the object containing the start address. We can skip it for the main binary and use "X" for an unknown object. This will save some space and facilitate human parsing. <start_offset> - hex offset from the object base load address (0 for the main executable unless it's PIE) to the start address. <end_id>, <end_offset> - same for the end address. <count> - total aggregated count of the branch or a fall-through. <mispred_count> - the number of times the branch was mispredicted. Omitted for fall-throughs. Example F 41be50 41be50 3 F 41be90 41be90 4 f 41be90 41be90 7 B 4b1942 39b57f0 3 0 B 4b196f 4b19e0 2 0 (cherry picked from FBD8887182)	2018-07-17 18:31:46 -07:00
Rafael Auler	12380b8b06	Fix assembly after adding entry points Summary: When a given function B, located after function A, references one of A's basic blocks, it registers a new global symbol at the reference address and updates A's Labels vector via BinaryFunction::addEntryPoint(). However, we don't update A's branch targets at this point. So we end up with an inconsistent CFG, where the basic block names are global symbols, but the internal branch operands still reference the old local names of the corresponding blocks that got promoted to entry points. This patch fixes this by detecting this situation in addEntryPoint and iterating over all instructions, looking for references to the old symbol and replacing them with the new global symbol (since this is now an entry point). Fixes facebookincubator/BOLT#26 (cherry picked from FBD8728407)	2018-07-03 11:57:46 -07:00
Rafael Auler	544d1577c1	Avoid removing BBs referenced by JTs Summary: While removing unreachable blocks, we may decide to remove a block that is listed as a target in a jump table entry. If we do that, this label will be then undefined and LLVM assembler will crash. Mitigate this for now by not removing such blocks, as we don't support removing unnecessary jump tables yet. Fixes facebookincubator/BOLT#20 (cherry picked from FBD8730269)	2018-07-03 17:02:33 -07:00
Rafael Auler	8f717dd25e	[BOLT] Add initial bolt-only test infra Summary: Create folders and setup to make LIT run BOLT-only tests. Add a test example. This will add a new make/ninja rule "check-bolt" that the user can invoke to run LIT on this folder. (cherry picked from FBD8595786)	2018-06-22 13:50:07 -07:00
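
The pre-aggregated file format described in commit ddfcf4f266 can be read with a few lines of code. The following is a minimal Python sketch and not BOLT's actual parser (which is C++). The names `Location`, `Record`, `parse_location` and `parse_line` are hypothetical, and it assumes that counts are decimal, since the commit message only states that offsets are hex.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Location:
    build_id: Optional[str]  # None: ID omitted (main binary); "X": unknown object
    offset: int              # hex offset from the object's base load address


@dataclass
class Record:
    kind: str                # "B" (branch), "F" (fall-through), "f" (external origin)
    start: Location
    end: Location
    count: int
    mispredictions: Optional[int]  # None when the field is omitted


def parse_location(field: str) -> Location:
    """Parse `[<id>:]<offset>`; the build ID prefix is optional."""
    build_id, sep, offset = field.rpartition(":")
    return Location(build_id if sep else None, int(offset, 16))


def parse_line(line: str) -> Record:
    """Parse one `{B|F|f} <start> <end> <count> [<mispred_count>]` line."""
    fields = line.split()
    if len(fields) not in (4, 5):
        raise ValueError(f"expected 4 or 5 fields, got {len(fields)}: {line!r}")
    kind = fields[0]
    if kind not in ("B", "F", "f"):
        raise ValueError(f"unknown record type {kind!r}")
    # Assumption: counts are decimal; the commit message does not say.
    mispredictions = int(fields[4]) if len(fields) == 5 else None
    return Record(kind, parse_location(fields[1]), parse_location(fields[2]),
                  int(fields[3]), mispredictions)


# The example lines from the commit message.
EXAMPLE = """\
F 41be50 41be50 3
F 41be90 41be90 4
f 41be90 41be90 7
B 4b1942 39b57f0 3 0
B 4b196f 4b19e0 2 0
"""

records = [parse_line(line) for line in EXAMPLE.splitlines()]
for record in records:
    print(record)
```

Splitting the location on the last `:` keeps the optional build ID and the offset in one field, which matches the `[<start_id>:]<start_offset>` syntax.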

28 Commits
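
The sign issue fixed in commit d7fb998637 comes down to how a 32-bit field is decoded. R_X86_64_32S denotes a sign-extended 32-bit value, so the four encoded bytes have to be read as a signed integer. The Python sketch below only illustrates that difference. It is not BOLT's code, and `decode_imm32` is a hypothetical helper name.

```python
def decode_imm32(raw: bytes, signed: bool) -> int:
    """Decode a 4-byte little-endian field as it would appear in x86-64 code."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    return int.from_bytes(raw, byteorder="little", signed=signed)


# Bytes f0 ff ff ff encode 0xfffffff0 in little-endian order.
raw = bytes.fromhex("f0ffffff")

as_unsigned = decode_imm32(raw, signed=False)  # 4294967280: wrong for a 32S field
as_signed = decode_imm32(raw, signed=True)     # -16: the sign-extended value

print(as_unsigned, as_signed)
```

Reading the field as unsigned yields a large positive value instead of a small negative one, so any target recomputed from it would be off by 2^32.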