llvm-project

Commit Graph

Author	SHA1	Message	Date
spupyrev	eecd41aa09	Revert "Rebase: [Facebook] [MC] Introduce NeverAlign fragment type" This reverts commit `6d0528636a`.	2022-07-11 09:50:47 -07:00
Rafael Auler	6d0528636a	Rebase: [Facebook] [MC] Introduce NeverAlign fragment type Summary: Introduce NeverAlign fragment type. The intended usage of this fragment is to insert it before a pair of macro-op fusion eligible instructions. NeverAlign fragment ensures that the next fragment (first instruction in the pair) does not end at a given alignment boundary by emitting a minimal size nop if necessary. In effect, it ensures that a pair of macro-fusible instructions is not split by a given alignment boundary, which is a precondition for macro-op fusion in modern Intel Cores (64B = cache line size, see Intel Architecture Optimization Reference Manual, 2.3.2.1 Legacy Decode Pipeline: Macro-Fusion). This patch introduces functionality used by BOLT when emitting code with MacroFusion alignment already in place. The use case is different from BoundaryAlign and instruction bundling: - BoundaryAlign can be extended to perform the desired alignment for the first instruction in the macro-op fusion pair (D101817). However, this approach has higher overhead due to reliance on relaxation as BoundaryAlign requires in the general case - see https://reviews.llvm.org/D97982#2710638. - Instruction bundling: the intent of NeverAlign fragment is to prevent the first instruction in a pair ending at a given alignment boundary, by inserting at most one minimum size nop. It's OK if either instruction crosses the cache line. Padding both instructions using bundles to not cross the alignment boundary would result in excessive padding. There's no straightforward way to request instruction bundling to avoid a given end alignment for the first instruction in the bundle. LLVM: https://reviews.llvm.org/D97982 Manual rebase conflict history: https://phabricator.intern.facebook.com/D30142613 Test Plan: sandcastle Reviewers: #llvm-bolt Subscribers: phabricatorlinter Differential Revision: https://phabricator.intern.facebook.com/D31361547	2022-07-11 09:31:52 -07:00
Guillaume Chatelet	412c788ab0	[NFC][Alignment] Use Align in MCAlignFragment	2022-06-15 12:31:00 +00:00
Fangrui Song	689c3a2552	[MC] Fix letter case of some MCSection member functions	2022-03-11 20:07:00 -08:00
serge-sans-paille	ef736a1c39	Cleanup LLVMMC headers There's a few relevant forward declarations in there that may require downstream adding explicit includes: llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h llvm/MC/MCObjectStreamer.h no longer include llvm/MC/MCAssembler.h llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h Counting preprocessed lines required to rebuild llvm-project on my setup: before: 1052436830 after: 1049293745 Which is significant and backs up the change in addition to the usual benefits of decreasing coupling between headers and compilation units. Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D119244	2022-02-09 11:09:17 +01:00
Alex Lorenz	0756aa3978	[macho] add support for emitting macho files with two build version load commands This patch extends LLVM IR to add metadata that can be used to emit macho files with two build version load commands. It utilizes "darwin.target_variant.triple" and "darwin.target_variant.SDK Version" metadata names for that, which will be set by a future patch in clang. MachO uses two build version load commands to represent an object file / binary that is targeting both the macOS target, and the Mac Catalyst target. At runtime, a dynamic library that supports both targets can be loaded from either a native macOS or a Mac Catalyst app on a macOS system. We want to add support to this to upstream to LLVM to be able to build compiler-rt for both targets, to finish the complete support for the Mac Catalyst platform, which is right now targetable by upstream clang, but the compiler-rt bits aren't supported because of the lack of this multiple build version support. Differential Revision: https://reviews.llvm.org/D112189	2021-12-07 18:17:47 -08:00
Peter Smith	e63455d5e0	[MC] Use local MCSubtargetInfo in writeNops On some architectures such as Arm and X86 the encoding for a nop may change depending on the subtarget in operation at the time of encoding. This change replaces the per module MCSubtargetInfo retained by the targets AsmBackend in favour of passing through the local MCSubtargetInfo in operation at the time. On Arm using the architectural NOP instruction can have a performance benefit on some implementations. For Arm I've deleted the copy of the AsmBackend's MCSubtargetInfo to limit the chances of this causing problems in the future. I've not done this for other targets such as X86 as there is more frequent use of the MCSubtargetInfo and it looks to be for stable properties that we would not expect to vary per function. This change required threading STI through MCNopsFragment and MCBoundaryAlignFragment. I've attempted to take into account the in tree experimental backends. Differential Revision: https://reviews.llvm.org/D45962	2021-09-07 15:46:19 +01:00
Saleem Abdulrasool	bbea64250f	RISCV: adjust handling of relocation emission for RISCV This re-architects the RISCV relocation handling to bring the implementation closer in line with the implementation in binutils. We would previously aggressively resolve the relocation. With this restructuring, we always will emit a paired relocation for any symbolic difference of the type of S±T[±C] where S and T are labels and C is a constant. GAS has a special target hook controlled by `RELOC_EXPANSION_POSSIBLE` which indicates that a fixup may be expanded into multiple relocations. This is used by the RISCV backend to always emit a paired relocation - either ADD[WIDTH] + SUB[WIDTH] for text relocations or SET[WIDTH] + SUB[WIDTH] for a debug info relocation. Irrespective of whether linker relaxation support is enabled, symbolic difference is always emitted as a paired relocation. This change also sinks the target specific behaviour down into the target specific area rather than exposing it to the shared relocation handling. In the process, we also sink the "special" handling for debug information down into the RISCV target. Although this improves the path for the other targets, this is not necessarily entirely ideal either. The changes in the debug info emission could be done through another type of hook as this functionality would be required by any other target which wishes to do linker relaxation. However, as there are no other targets in LLVM which currently do this, this is a reasonable thing to do until such time as the code needs to be shared. Improve the handling of the relocation (and add a reduced test case from the Linux kernel) to ensure that we handle complex expressions for symbolic difference. This ensures that we correct relocate symbols with the adddends normalized and associated with the addition portion of the paired relocation. This change also addresses some review comments from Alex Bradbury about the relocations meant for use in the DWARF CFA being named incorrectly (using ADD6 instead of SET6) in the original change which introduced the relocation type. This resolves the issues with the symbolic difference emission sufficiently to enable building the Linux kernel with clang+IAS+lld (without linker relaxation). Resolves PR50153, PR50156! Fixes: ClangBuiltLinux/linux#1023, ClangBuiltLinux/linux#1143 Reviewed By: nickdesaulniers, maskray Differential Revision: https://reviews.llvm.org/D103539	2021-06-17 08:20:02 -07:00
Fangrui Song	71635ea5ff	MCDwarf: Delete uneeded parameter And change signature	2021-01-21 00:55:07 -08:00
Hongtao Yu	705a4c149d	[CSSPGO] Pseudo probe encoding and emission. This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections. The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead. ELF object emission The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each fo which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission. Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication. A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool. The format of `.pseudo_probe_desc` section looks like: ``` .section .pseudo_probe_desc,"",@progbits .quad 6309742469962978389 // Func GUID .quad 4294967295 // Func Hash .byte 9 // Length of func name .ascii "_Z5funcAi" // Func name .quad 7102633082150537521 .quad 138828622701 .byte 12 .ascii "_Z8funcLeafi" .quad 446061515086924981 .quad 4294967295 .byte 9 .ascii "_Z5funcBi" .quad -2016976694713209516 .quad 72617220756 .byte 7 .ascii "_Z3fibi" ``` For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format : ``` FUNCTION BODY (one for each outlined function present in the text section) GUID (uint64) GUID of the function NPROBES (ULEB128) Number of probes originating from this function. NUM_INLINED_FUNCTIONS (ULEB128) Number of callees inlined into this function, aka number of first-level inlinees PROBE RECORDS A list of NPROBES entries. Each entry contains: INDEX (ULEB128) TYPE (uint4) 0 - block probe, 1 - indirect call, 2 - direct call ATTRIBUTE (uint3) reserved ADDRESS_TYPE (uint1) 0 - code address, 1 - address delta CODE_ADDRESS (uint64 or ULEB128) code address or address delta, depending on ADDRESS_TYPE INLINED FUNCTION RECORDS A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined callees. Each record contains: INLINE SITE GUID of the inlinee (uint64) ID of the callsite probe (ULEB128) FUNCTION BODY A FUNCTION BODY entry describing the inlined function. ``` To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node and the tree path starting from the root all the way down to the tree node is the inline context of the probes. The emission happens on the whole tree top-down recursively. Probes of a tree node will be emitted altogether with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings, the address is encoded as a delta from the previous probe except for the first probe. Variant-sized integer encoding, aka LEB128, is used for address delta and probe index. Assembling Pseudo probes can be printed as assembly directives alternatively. This allows for good assembly code readability and also provides a view of how optimizations and pseudo probes affect each other, especially helpful for diff time assembly analysis. A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudoprobe` section in the object file. A example assembly looks like: ``` foo2: # @foo2 # %bb.0: # %bb0 pushq %rax testl %edi, %edi .pseudoprobe 837061429793323041 1 0 0 je .LBB1_1 # %bb.2: # %bb2 .pseudoprobe 837061429793323041 6 2 0 callq foo .pseudoprobe 837061429793323041 3 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq .LBB1_1: # %bb1 .pseudoprobe 837061429793323041 5 1 0 callq %rsi .pseudoprobe 837061429793323041 2 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq # -- End function .section .pseudo_probe_desc,"",@progbits .quad 6699318081062747564 .quad 72617220756 .byte 3 .ascii "foo" .quad 837061429793323041 .quad 281547593931412 .byte 4 .ascii "foo2" ``` With inlining turned on, the assembly may look different around %bb2 with an inlined probe: ``` # %bb.2: # %bb2 .pseudoprobe 837061429793323041 3 0 .pseudoprobe 6699318081062747564 1 0 @ 837061429793323041:6 .pseudoprobe 837061429793323041 4 0 popq %rax retq ``` Disassembling* We have a disassembling tool (llvm-profgen) that can display disassembly alongside with pseudo probes. So far it only supports ELF executable file. An example disassembly looks like: ``` 00000000002011a0 <foo2>: 2011a0: 50 push rax 2011a1: 85 ff test edi,edi [Probe]: FUNC: foo2 Index: 1 Type: Block 2011a3: 74 02 je 2011a7 <foo2+0x7> [Probe]: FUNC: foo2 Index: 3 Type: Block [Probe]: FUNC: foo2 Index: 4 Type: Block [Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6 2011a5: 58 pop rax 2011a6: c3 ret [Probe]: FUNC: foo2 Index: 2 Type: Block 2011a7: bf 01 00 00 00 mov edi,0x1 [Probe]: FUNC: foo2 Index: 5 Type: IndirectCall 2011ac: ff d6 call rsi [Probe]: FUNC: foo2 Index: 4 Type: Block 2011ae: 58 pop rax 2011af: c3 ret ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D91878	2020-12-10 17:29:28 -08:00
Mitch Phillips	7ead5f5aa3	Revert "[CSSPGO] Pseudo probe encoding and emission." This reverts commit `b035513c06`. Reason: Broke the ASan buildbots: http://lab.llvm.org:8011/#/builders/5/builds/2269	2020-12-10 15:53:39 -08:00
Hongtao Yu	b035513c06	[CSSPGO] Pseudo probe encoding and emission. This change implements pseudo probe encoding and emission for CSSPGO. Please see RFC here for more context: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s Pseudo probes are in the form of intrinsic calls on IR/MIR but they do not turn into any machine instructions. Instead they are emitted into the binary as a piece of data in standalone sections. The probe-specific sections are not needed to be loaded into memory at execution time, thus they do not incur a runtime overhead. ELF object emission The binary data to emit are organized as two ELF sections, i.e, the `.pseudo_probe_desc` section and the `.pseudo_probe` section. The `.pseudo_probe_desc` section stores a function descriptor for each function and the `.pseudo_probe` section stores the actual probes, each fo which corresponds to an IR basic block or an IR function callsite. A function descriptor is stored as a module-level metadata during the compilation and is serialized into the object file during object emission. Both the probe descriptors and pseudo probes can be emitted into a separate ELF section per function to leverage the linker for deduplication. A `.pseudo_probe` section shares the same COMDAT group with the function code so that when the function is dead, the probes are dead and disposed too. On the contrary, a `.pseudo_probe_desc` section has its own COMDAT group. This is because even if a function is dead, its probes may be inlined into other functions and its descriptor is still needed by the profile generation tool. The format of `.pseudo_probe_desc` section looks like: ``` .section .pseudo_probe_desc,"",@progbits .quad 6309742469962978389 // Func GUID .quad 4294967295 // Func Hash .byte 9 // Length of func name .ascii "_Z5funcAi" // Func name .quad 7102633082150537521 .quad 138828622701 .byte 12 .ascii "_Z8funcLeafi" .quad 446061515086924981 .quad 4294967295 .byte 9 .ascii "_Z5funcBi" .quad -2016976694713209516 .quad 72617220756 .byte 7 .ascii "_Z3fibi" ``` For each `.pseudoprobe` section, the encoded binary data consists of a single function record corresponding to an outlined function (i.e, a function with a code entry in the `.text` section). A function record has the following format : ``` FUNCTION BODY (one for each outlined function present in the text section) GUID (uint64) GUID of the function NPROBES (ULEB128) Number of probes originating from this function. NUM_INLINED_FUNCTIONS (ULEB128) Number of callees inlined into this function, aka number of first-level inlinees PROBE RECORDS A list of NPROBES entries. Each entry contains: INDEX (ULEB128) TYPE (uint4) 0 - block probe, 1 - indirect call, 2 - direct call ATTRIBUTE (uint3) reserved ADDRESS_TYPE (uint1) 0 - code address, 1 - address delta CODE_ADDRESS (uint64 or ULEB128) code address or address delta, depending on ADDRESS_TYPE INLINED FUNCTION RECORDS A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined callees. Each record contains: INLINE SITE GUID of the inlinee (uint64) ID of the callsite probe (ULEB128) FUNCTION BODY A FUNCTION BODY entry describing the inlined function. ``` To support building a context-sensitive profile, probes from inlinees are grouped by their inline contexts. An inline context is logically a call path through which a callee function lands in a caller function. The probe emitter builds an inline tree based on the debug metadata for each outlined function in the form of a trie tree. A tree root is the outlined function. Each tree edge stands for a callsite where inlining happens. Pseudo probes originating from an inlinee function are stored in a tree node and the tree path starting from the root all the way down to the tree node is the inline context of the probes. The emission happens on the whole tree top-down recursively. Probes of a tree node will be emitted altogether with their direct parent edge. Since a pseudo probe corresponds to a real code address, for size savings, the address is encoded as a delta from the previous probe except for the first probe. Variant-sized integer encoding, aka LEB128, is used for address delta and probe index. Assembling Pseudo probes can be printed as assembly directives alternatively. This allows for good assembly code readability and also provides a view of how optimizations and pseudo probes affect each other, especially helpful for diff time assembly analysis. A pseudo probe directive has the following operands in order: function GUID, probe index, probe type, probe attributes and inline context. The directive is generated by the compiler and can be parsed by the assembler to form an encoded `.pseudoprobe` section in the object file. A example assembly looks like: ``` foo2: # @foo2 # %bb.0: # %bb0 pushq %rax testl %edi, %edi .pseudoprobe 837061429793323041 1 0 0 je .LBB1_1 # %bb.2: # %bb2 .pseudoprobe 837061429793323041 6 2 0 callq foo .pseudoprobe 837061429793323041 3 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq .LBB1_1: # %bb1 .pseudoprobe 837061429793323041 5 1 0 callq %rsi .pseudoprobe 837061429793323041 2 0 0 .pseudoprobe 837061429793323041 4 0 0 popq %rax retq # -- End function .section .pseudo_probe_desc,"",@progbits .quad 6699318081062747564 .quad 72617220756 .byte 3 .ascii "foo" .quad 837061429793323041 .quad 281547593931412 .byte 4 .ascii "foo2" ``` With inlining turned on, the assembly may look different around %bb2 with an inlined probe: ``` # %bb.2: # %bb2 .pseudoprobe 837061429793323041 3 0 .pseudoprobe 6699318081062747564 1 0 @ 837061429793323041:6 .pseudoprobe 837061429793323041 4 0 popq %rax retq ``` Disassembling* We have a disassembling tool (llvm-profgen) that can display disassembly alongside with pseudo probes. So far it only supports ELF executable file. An example disassembly looks like: ``` 00000000002011a0 <foo2>: 2011a0: 50 push rax 2011a1: 85 ff test edi,edi [Probe]: FUNC: foo2 Index: 1 Type: Block 2011a3: 74 02 je 2011a7 <foo2+0x7> [Probe]: FUNC: foo2 Index: 3 Type: Block [Probe]: FUNC: foo2 Index: 4 Type: Block [Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ foo2:6 2011a5: 58 pop rax 2011a6: c3 ret [Probe]: FUNC: foo2 Index: 2 Type: Block 2011a7: bf 01 00 00 00 mov edi,0x1 [Probe]: FUNC: foo2 Index: 5 Type: IndirectCall 2011ac: ff d6 call rsi [Probe]: FUNC: foo2 Index: 4 Type: Block 2011ae: 58 pop rax 2011af: c3 ret ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D91878	2020-12-10 09:50:08 -08:00
Fangrui Song	35be65cb1c	[MC] Fix an assert in MCAssembler::writeSectionData to be aware of errors If MCContext has an error, MCAssembler::layout may stop early and some MCFragment's may not finalize. In the Linux kernel, arch/x86/lib/memcpy_64.S could trigger the assert before "x86_64: Change .weak to SYM_FUNC_START_WEAK for arch/x86/lib/mem*_64.S"	2020-10-29 23:11:18 -07:00
Fangrui Song	45d0343900	[MC] Allow .org directives in SHT_NOBITS sections This is used by kvm-unit-tests and can be trivially supported.	2020-09-11 15:12:42 -07:00
Jian Cai	c6334db577	[X86] support .nops directive Add support of .nops on X86. This addresses llvm.org/PR45788. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D82826	2020-08-03 11:50:56 -07:00
Shengchen Kan	e59e39b7c4	[MC] Simplify the logic of applying fixup for fragments, NFCI Replace mutiple `if else` clauses with a `switch` clause and remove redundant checks. Before this patch, we need to add a statement like `if(!isa<MCxxxFragment>(Frag)) ` here each time we add a new kind of `MCEncodedFragment` even if it has no fixups. After this patch, we don't need to do that. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D83366	2020-07-09 16:39:13 +08:00
Thomas Preud'homme	6c67ee0f58	[MC] Fix PR45805: infinite recursion in assembler Give up folding an expression if the fragment of one of the operands would require laying out a fragment already being laid out. This prevents hitting an infinite recursion when a fill size expression refers to a later fragment since computing the offset of that fragment would require laying out the fill fragment and thus computing its size expression. Reviewed By: echristo Differential Revision: https://reviews.llvm.org/D79570	2020-06-25 15:42:36 +01:00
Shengchen Kan	8bb059ab63	[MC][Bugfix] Remove redundant parameter for relaxInstruction Summary: Before this patch, `relaxInstruction` takes three arguments, the first argument refers to the instruction before relaxation and the third argument is the output instruction after relaxation. There are two quite strange things: 1) The first argument's type is `const MCInst &`, the third argument's type is `MCInst &`, but they may be aliased to the same variable 2) The backends of ARM, AMDGPU, RISC-V, Hexagon assume that the third argument is a fresh uninitialized `MCInst` even if `relaxInstruction` may be called like `relaxInstruction(Relaxed, STI, Relaxed)` in a loop. In this patch, we drop the thrid argument, and let `relaxInstruction` directly modify the given instruction. Also, this patch fixes the bug https://bugs.llvm.org/show_bug.cgi?id=45580, which is introduced by D77851, and breaks the assumption of ARM, AMDGPU, RISC-V, Hexagon. Reviewers: Razer6, MaskRay, jyknight, asb, luismarques, enderby, rtaylor, colinl, bcain Reviewed By: Razer6, MaskRay, bcain Subscribers: bcain, nickdesaulniers, nathanchance, wuzish, annita.zhang, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, tpr, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78364	2020-04-21 11:06:55 +08:00
Simon Pilgrim	bcd7f77713	MCObjectWriter.h - remove Endian.h/EndianStream.h/raw_ostream.h includes. NFC Push these includes down to the the writers that actually need them, a number of which were implicitly relying on the MCObjectWriter.h.	2020-04-17 10:44:08 +01:00
Fangrui Song	e13a8a1fc5	[MC][COFF][ELF] Reject instructions in IMAGE_SCN_CNT_UNINITIALIZED_DATA/SHT_NOBITS sections For `.bss; nop`, MC inappropriately calls abort() (via report_fatal_error()) with a message `cannot have fixups in virtual section!` It is a bug to crash for invalid user input. Fix it by erroring out early in EmitInstToData(). Similarly, emitIntValue() in a virtual section (SHT_NOBITS in ELF) can crash with the mssage `non-zero initializer found in section '.bss'` (see D4199) It'd be nice to report the location but so many directives can call emitIntValue() and it is difficult to track every location. Note, COFF does not crash because MCAssembler::writeSectionData() is not called for an IMAGE_SCN_CNT_UNINITIALIZED_DATA section. Note, GNU as' arm64 backend reports ``Error: attempt to store non-zero value in section `.bss'`` for a non-zero .inst but fails to do so for other instructions. We simply reject all instructions, even if the encoding is all zeros. The Mach-O counterpart is D48517 (see `test/MC/MachO/zerofill-text.s`) Reviewed By: rnk, skan Differential Revision: https://reviews.llvm.org/D78138	2020-04-15 21:02:47 -07:00
Fangrui Song	90a63f6d2d	[MC] Replace MCSection*::getName() with MCSection::getName(). NFC I plan to use MCSection::getName() in D78138. Having the function in the base class is also convenient for debugging. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D78251	2020-04-15 18:35:27 -07:00
Fangrui Song	7d1ff446b6	[MC] Rename MCSection::getSectionName() to getName(). NFC A pending change will merge MCSection::getName() to MCSection::getName().	2020-04-15 16:48:14 -07:00
Jian Cai	6a38e0e4f5	[MC] Recalculate fragment offsets after relaxation Summary: The current relaxation implementation is not correctly adjusting the size and offsets of fragements in one section based on changes in size of another if the layout order of the two happened to be such that the former was visited before the later. Therefore, we need to invalidate the fragments in all sections after each iteration of relaxation, and possibly further relax some of them in the next ieration. This fixes PR#45190. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76114	2020-03-17 14:48:05 -07:00
Shengchen Kan	3a503ce663	[X86] Reduce the number of emitted fragments due to branch align Summary: Currently, a BoundaryAlign fragment may be inserted after the branch that needs to be aligned to truncate the current fragment, this fragment is unused at most of time. To avoid that, we can insert a new empty Data fragment instead. Non-relaxable instruction is usually emitted into Data fragment, so the inserted empty Data fragment will be reused at a high possibility. Reviewers: annita.zhang, reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: reames, LuoYuanke Subscribers: llvm-commits, dexonsmith, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D75438	2020-03-12 15:37:35 +08:00
Shengchen Kan	af57b139a0	Temporarily Revert [X86] Not track size of the boudaryalign fragment during the layout Summary: This reverts commit `2ac19feb15`. This commit causes some test cases to run fail when branch is aligned.	2020-03-03 11:15:56 +08:00
Philip Reames	5565820e6e	Use range-for in MCAssembler [NFC]	2020-03-02 14:57:35 -08:00
Shengchen Kan	2ac19feb15	[X86] Not track size of the boudaryalign fragment during the layout Summary: Currently the boundaryalign fragment caches its size during the process of layout and then it is relaxed and update the size in each iteration. This behaviour is unnecessary and ugly. Reviewers: annita.zhang, reames, MaskRay, craig.topper, LuoYuanke, jyknight Reviewed By: MaskRay Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75404	2020-03-02 09:32:30 +08:00
Hans Wennborg	2e24219d3c	[MC][ARM] Resolve some pcrel fixups at assembly time (PR44929) MC currently does not emit these relocation types, and lld does not handle them. Add FKF_Constant as a work-around of some ARM code after D72197. Eventually we probably should implement these relocation types. By Fangrui Song! Differential revision: https://reviews.llvm.org/D72892	2020-02-27 12:43:29 +01:00
Philip Reames	eca4bfea3d	[MC] Pull out a relaxFragment helper [NFC] Having this as it's own function helps to reduce indentation and allows use of return instead of wiring a value over the switch. A lambda would have also worked, but with slightly deeper nesting.	2020-02-26 13:37:12 -08:00
James Clarke	3f5976c97d	[RISCV] Fix evaluating %pcrel_lo against global and weak symbols Summary: Previously, we would erroneously turn %pcrel_lo(label), where label has a %pcrel_hi against a weak symbol, into %pcrel_lo(label + offset), as evaluatePCRelLo would believe the target independent logic was going to fold it. Moreover, even if that were fixed, shouldForceRelocation lacks an MCAsmLayout and thus cannot evaluate the %pcrel_hi fixup to a value and check the symbol, so we would then erroneously constant-fold the %pcrel_lo whilst leaving the %pcrel_hi intact. After D72197, this same sequence also occurs for symbols with global binding, which is triggered in real-world code. Instead, as discussed in D71978, we introduce a new FKF_IsTarget flag to avoid these kinds of issues. All the resolution logic happens in one place, with no coordination required between RISCAsmBackend and RISCVMCExpr to ensure they implement the same logic twice. Although the implementation of %pcrel_hi can be left as target independent, we make it target dependent to ensure that they are handled identically to %pcrel_lo, otherwise we risk one of them being constant folded but the other being preserved. This also allows us to properly support fixup pairs where the instructions are in different fragments. Reviewers: asb, lenary, efriedma Reviewed By: efriedma Subscribers: arichardson, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73211	2020-01-23 02:05:48 +00:00
Simon Pilgrim	46e2f89364	[MC] writeFragment - assert MCFragment::FT_Fill length is legal. Silence (clang/MSVC) static analyzer warnings that the fragment data may either write out of bounds of the local array or reference uninitialized data.	2020-01-08 17:19:11 +00:00
James Henderson	d68904f957	[NFC] Fix trivial typos in comments Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D72143 Patch by Kazuaki Ishizaki.	2020-01-06 10:50:26 +00:00
Fangrui Song	586acd8490	[MC] Delete MCSection::{rbegin,rend}	2020-01-05 12:51:15 -08:00
Fangrui Song	c764304adc	[MC] Drop an unused rule about absolute temporary symbols	2020-01-05 11:39:52 -08:00
Philip Reames	14fc20ca62	Align branches within 32-Byte boundary (NOP padding) WARNING: If you're looking at this patch because you're looking for a full performace mitigation of the Intel JCC Erratum, this is not it! This is a preliminary patch on the patch towards mitigating the performance regressions caused by Intel's microcode update for Jump Conditional Code Erratum. For context, see: https://www.intel.com/content/www/us/en/support/articles/000055650.html The patch adds the required assembler infrastructure and command line options needed to exercise the logic for INTERNAL TESTING. These are NOT public flags, and should not be used for anything other than LLVM's own testing/debugging purposes. They are likely to change both in spelling and meaning. WARNING: This patch is knowingly incorrect in some cornercases. We need, and do not yet provide, a mechanism to selective enable/disable the padding. Conversation on this will continue in parellel with work on extending this infrastructure to support prefix padding. The goal here is to have the assembler align specific instructions such that they neither cross or end at a 32 byte boundary. The impacted instructions are: a. Conditional jump. b. Fused conditional jump. c. Unconditional jump. d. Indirect jump. e. Ret. f. Call. The new options for llvm-mc are: -x86-align-branch-boundary=NUM aligns branches within NUM byte boundary. -x86-align-branch=TYPE[+TYPE...] specifies types of branches to align. A new MCFragment type, MCBoundaryAlignFragment, is added, which may emit NOP to align the fused/unfused branch. alignBranchesBegin inserts MCBoundaryAlignFragment before instructions, alignBranchesEnd marks the end of the branch to be aligned, relaxBoundaryAlign grows or shrinks sizes of NOP to align the target branch. Nop padding is disabled when the instruction may be rewritten by the linker, such as TLS Call. Process Note: I am landing a patch by skan as it has been LGTMed, and continuing to iterate on the review is simply slowing us down at this point. We can and will continue to iterate in tree. Patch By: skan Differential Revision: https://reviews.llvm.org/D70157	2019-12-20 11:35:50 -08:00
Fangrui Song	9574757dba	[MC] Delete MCCodePadder D34393 added MCCodePadder as an infrastructure for padding code with NOP instructions. It lacked tests and was not being worked on since then. Intel has now worked on an assembler patch to mitigate performance loss after applying microcode update for the Jump Conditional Code Erratum. https://www.intel.com/content/www/us/en/support/articles/000055650/processors.html This new patch shares similarity with MCCodePadder, but has a concrete use case in mind and is being actively developed. The infrastructure it introduces can potentially be used for general performance improvement via alignment. Delete the unused MCCodePadder so that people can develop the new feature from a clean state. Reviewed By: jyknight, skan Differential Revision: https://reviews.llvm.org/D71106	2019-12-09 19:21:31 -08:00
Guillaume Chatelet	18f805a7ea	[Alignment][NFC] Remove unneeded llvm:: scoping on Align types llvm-svn: 373081	2019-09-27 12:54:21 +00:00
Guillaume Chatelet	af11cc7eb5	[Alignment] Move OffsetToAlignment to Alignment.h Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, JDevlieghere, alexshap, rupprecht, jhenderson Subscribers: sdardis, nemanjai, hiraditya, kbarton, jakehehrlich, jrtc27, MaskRay, atanasyan, jsji, seiya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D67499 llvm-svn: 371742	2019-09-12 15:20:36 +00:00
Hsiangkai Wang	18ccfadd46	[DebugInfo] Generate fixups as emitting DWARF .debug_frame/.eh_frame. It is necessary to generate fixups in .debug_frame or .eh_frame as relaxation is enabled due to the address delta may be changed after relaxation. There is an opcode with 6-bits data in debug frame encoding. So, we also need 6-bits fixup types. Differential Revision: https://reviews.llvm.org/D58335 llvm-svn: 366524	2019-07-19 02:03:34 +00:00
Hsiangkai Wang	657277e0f1	Revert "[DebugInfo] Generate fixups as emitting DWARF .debug_frame/.eh_frame." This reverts commit 17e3cbf5fe656483d9016d0ba9e1d0cd8629379e. llvm-svn: 366444	2019-07-18 15:06:50 +00:00
Hsiangkai Wang	e43ce1a958	[DebugInfo] Generate fixups as emitting DWARF .debug_frame/.eh_frame. It is necessary to generate fixups in .debug_frame or .eh_frame as relaxation is enabled due to the address delta may be changed after relaxation. There is an opcode with 6-bits data in debug frame encoding. So, we also need 6-bits fixup types. Differential Revision: https://reviews.llvm.org/D58335 llvm-svn: 366442	2019-07-18 14:47:34 +00:00
Shiva Chen	5af037f1e9	[RISCV] Insert R_RISCV_ALIGN relocation type and Nops for code alignment when linker relaxation enabled Linker relaxation may change code size. We need to fix up the alignment of alignment directive in text section by inserting Nops and R_RISCV_ALIGN relocation type. So then linker could satisfy the alignment by removing Nops. To do this: 1. Add shouldInsertExtraNopBytesForCodeAlign target hook to calculate the Nops we need to insert. 2. Add shouldInsertFixupForCodeAlign target hook to insert R_RISCV_ALIGN fixup type. Differential Revision: https://reviews.llvm.org/D47755 llvm-svn: 352616	2019-01-30 11:16:59 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Alex Lorenz	afa75d7843	[macho] save the SDK version stored in module metadata into the version min and build version load commands in the object file This commit introduces a new metadata node called "SDK Version". It will be set by the frontend to mark the platform SDK (macOS/iOS/etc) version which was used during that particular compilation. This node is used when machine code is emitted, by either saving the SDK version into the appropriate macho load command (version min/build version), or by emitting the assembly for these load commands with the SDK version specified as well. The assembly for both load commands is extended by allowing it to contain the sdk_version X, Y [, Z] trailing directive to represent the SDK version respectively. rdar://45774000 Differential Revision: https://reviews.llvm.org/D55612 llvm-svn: 349119	2018-12-14 01:14:10 +00:00
Hsiangkai Wang	2089e4c8f1	[DebugInfo] Fix build failed in clang-x86_64-linux-selfhost-modules. Only generate symbol difference expression if needed. llvm-svn: 338484	2018-08-01 04:17:41 +00:00
Hsiangkai Wang	5c63af0d04	[DebugInfo] Generate fixups as emitting DWARF .debug_line. It is necessary to generate fixups in .debug_line as relaxation is enabled due to the address delta may be changed after relaxation. DWARF will record the mappings of lines and addresses in .debug_line section. It will encode the information using special opcodes, standard opcodes and extended opcodes in Line Number Program. I use DW_LNS_fixed_advance_pc to encode fixed length address delta and DW_LNE_set_address to encode absolute address to make it possible to generate fixups in .debug_line section. Differential Revision: https://reviews.llvm.org/D46850 llvm-svn: 338477	2018-08-01 02:18:06 +00:00
Fangrui Song	f78650a8de	Remove trailing space sed -Ei 's/[[:space:]]+$//' include/*/.{def,h,td} lib/*/.{cpp,h} llvm-svn: 338293	2018-07-30 19:41:25 +00:00
Peter Smith	1503fc0fd0	[MC] Move bundling and MCSubtargetInfo to MCEncodedFragment [NFC] Instruction bundling is only supported on descendants of the MCEncodedFragment type. By moving the bundling functionality and MCSubtargetInfo to this class it makes it easier to set and extract the MCSubtargetInfo when it is necessary. This is a refactoring change that will make it easier to pass the MCSubtargetInfo through to writeNops when nop padding is required. Differential Revision: https://reviews.llvm.org/D45959 llvm-svn: 334814	2018-06-15 09:48:18 +00:00
Sam Clegg	8c32e913b5	[MC] Move MCAssembler::dump into the correct cpp file. NFC Differential Revision: https://reviews.llvm.org/D46556 llvm-svn: 334713	2018-06-14 14:04:23 +00:00
Peter Smith	57f661bd7d	[MC] Pass MCSubtargetInfo to fixupNeedsRelaxation and applyFixup On targets like Arm some relaxations may only be performed when certain architectural features are available. As functions can be compiled with differing levels of architectural support we must make a judgement on whether we can relax based on the MCSubtargetInfo for the function. This change passes through the MCSubtargetInfo for the function to fixupNeedsRelaxation so that the decision on whether to relax can be made per function. In this patch, only the ARM backend makes use of this information. We must also pass the MCSubtargetInfo to applyFixup because some fixups skip error checking on the assumption that relaxation has occurred, to prevent code-generation errors applyFixup must see the same MCSubtargetInfo as fixupNeedsRelaxation. Differential Revision: https://reviews.llvm.org/D44928 llvm-svn: 334078	2018-06-06 09:40:06 +00:00

1 2 3 4 5 ...

445 Commits