The function does not need an MCStreamer per se; it was used only to get
access to the MCContext.
Differential Revision: https://reviews.llvm.org/D80205
In D49466, sys::path::replace_path_prefix was used instead startswith for -f[macro/debug/file]-prefix-map options.
However those were reverted later (commit rG3bb24bf25767ef5bbcef958b484e7a06d8689204) due to broken Windows tests.
This patch restores those replace_path_prefix calls.
It also modifies the prefix matching to be case-insensitive under Windows.
Differential Revision : https://reviews.llvm.org/D76869
Summary:
This patch tries to emit the correct alignment result for both
object file generation path and assembly path.
Reviewed by: hubert.reinterpretcast, DiggerLin, daltenty
Differential Revision: https://reviews.llvm.org/D79127
Summary:
This patch introduces the constants defined to identify DWARF sections
in XCOFF into `llvm/BinaryFormat/XCOFF.h` and adds (NFC) placeholder
code to `llvm/lib/MC/MCObjectFileInfo.cpp` where the DWARF sections for
XCOFF are to be set up.
Reviewers: jasonliu, sfertile, daltenty, DiggerLin, Xiangling_L
Reviewed By: jasonliu, sfertile, DiggerLin
Differential Revision: https://reviews.llvm.org/D79220
The function MCSymbolRefExpr::getVariantKindForName was missing the entry for
VK_PPC_GOT_PCREL. This patch adds the missing entry.
Differential Revision: https://reviews.llvm.org/D79015
The setting of `MCAsmInfo` properties for XCOFF got split between
`MCAsmInfoXCOFF` and `PPCXCOFFMCAsmInfo`. Except for the properties that
are dependent on the target information being passed via the
constructor, the properties being set in `PPCXCOFFMCAsmInfo` had no
fundamental reason for being treated as specific for XCOFF on PowerPC.
Indeed, the property that might be considered more specific to PowerPC,
`NeedsFunctionDescriptors`, was set in `MCAsmInfoXCOFF`.
XCOFF being specific to PowerPC anyway, this patch consolidates the
setting of the properties into `MCAsmInfoXCOFF` except for the cases
that are dependent on the information provided via the
`PPCXCOFFMCAsmInfo` constructor.
This patch also reorders the assignments to the fields to match the
declaration order in `MCAsmInfo`.
This change add support for defined wasm globals in the .s format,
the MC layer, and wasm-ld
Currently there is no support custom initialization and all wasm
globals are initialized to zero.
Fixes: PR45742
Differential Revision: https://reviews.llvm.org/D79137
The generic implementation is actually specific to x86. It assumes the
offset is relative to the end of the instruction and the immediate is
not scaled (which is false on most RISC).
Summary:
Before this patch, `relaxInstruction` takes three arguments, the first
argument refers to the instruction before relaxation and the third
argument is the output instruction after relaxation. There are two quite
strange things:
1) The first argument's type is `const MCInst &`, the third
argument's type is `MCInst &`, but they may be aliased to the same
variable
2) The backends of ARM, AMDGPU, RISC-V, Hexagon assume that the third
argument is a fresh uninitialized `MCInst` even if `relaxInstruction`
may be called like `relaxInstruction(Relaxed, STI, Relaxed)` in a
loop.
In this patch, we drop the thrid argument, and let `relaxInstruction`
directly modify the given instruction. Also, this patch fixes the bug https://bugs.llvm.org/show_bug.cgi?id=45580, which is introduced by D77851, and
breaks the assumption of ARM, AMDGPU, RISC-V, Hexagon.
Reviewers: Razer6, MaskRay, jyknight, asb, luismarques, enderby, rtaylor, colinl, bcain
Reviewed By: Razer6, MaskRay, bcain
Subscribers: bcain, nickdesaulniers, nathanchance, wuzish, annita.zhang, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, tpr, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78364
According to DWARF standard, is_stmt is a global flag; when set or cleared it should affect subsequent .loc directives.
However llvm assembler handled is_stmt differently: it forced all locations to have is_stmt=1 unless is_stmt was specified explicitly as 0.
The fix utilizes current DWARF state flags to compute correct is_stmt values.
See https://bugs.llvm.org/show_bug.cgi?id=45529 for a detailed issue description.
Reviewers: arsenm, probinson, enderby
Differential Revision: https://reviews.llvm.org/D78102
Add initial support for PC Relative addressing for global values that
require GOT indirect addressing. This patch adds PCRelative support for
global addresses that may not be known at link time and may require
access through the GOT.
Differential Revision: https://reviews.llvm.org/D76064
Ensure that symbols explicitly* assigned a section name are placed into
a section with a compatible entry size.
This is done by creating multiple sections with the same name** if
incompatible symbols are explicitly given the name of an incompatible
section, whilst:
- Avoiding using uniqued sections where possible (for readability and
to maximize compatibly with assemblers).
- Creating as few SHF_MERGE sections as possible (for efficiency).
Given that each symbol is assigned to a section in a single pass, we
must decide which section each symbol is assigned to without seeing the
properties of all symbols. A stable and easy to understand assignment is
desirable. The following rules facilitate this: The "generic" section
for a given section name will be mergeable if the name is a mergeable
"default" section name (such as .debug_str), a mergeable "implicit"
section name (such as .rodata.str2.2), or MC has already created a
mergeable "generic" section for the given section name (e.g. in response
to a section directive in inline assembly). Otherwise, the "generic"
section for a given name is non-mergeable; and, non-mergeable symbols
are assigned to the "generic" section, while mergeable symbols are
assigned to uniqued sections.
Terminology:
"default" sections are those always created by MC initially, e.g. .text
or .debug_str.
"implicit" sections are those created normally by MC in response to the
symbols that it encounters, i.e. in the absence of an explicit section
name assignment on the symbol, e.g. a function foo might be placed into
a .text.foo section.
"generic" sections are those that are referred to when a unique section
ID is not supplied, e.g. if there are multiple unique .bob sections then
".quad .bob" will reference the generic .bob section. Typically, the
generic section is just the first section of a given name to be created.
Default sections are always generic.
* Typically, section names might be explicitly assigned in source code
using a language extension e.g. a section attribute: _attribute_
((section ("section-name"))) -
https://clang.llvm.org/docs/AttributeReference.html
** I refer to such sections as unique/uniqued sections. In assembly the
", unique," assembly syntax is used to express such sections.
Fixes https://bugs.llvm.org/show_bug.cgi?id=43457.
See https://reviews.llvm.org/D68101 for previous discussions leading to
this patch.
Some minor fixes were required to LLVM's tests, for tests had been using
the old behavior - which allowed for explicitly assigning globals with
incompatible entry sizes to a section.
This fix relies on the ",unique ," assembly feature. This feature is not
available until bintuils version 2.35
(https://sourceware.org/bugzilla/show_bug.cgi?id=25380). If the
integrated assembler is not being used then we avoid using this feature
for compatibility and instead try to place mergeable symbols into
non-mergeable sections or issue an error otherwise.
Differential Revision: https://reviews.llvm.org/D72194
GNU as emits SHT_PROGBITS .eh_frame by default for .cfi_* directives.
We follow x86-64 psABI and use SHT_X86_64_UNWIND for .eh_frame
Don't error for SHT_PROGBITS .eh_frame on x86-64.
This keeps compatibility with `.section .eh_frame,"a",@progbits` in existing assembly files.
See https://groups.google.com/d/msg/x86-64-abi/7sr4E6THl3g/zUU2UPHOAQAJ
for more discussions.
Reviewed By: joerg
Differential Revision: https://reviews.llvm.org/D76151
For `.bss; nop`, MC inappropriately calls abort() (via report_fatal_error()) with a message
`cannot have fixups in virtual section!`
It is a bug to crash for invalid user input. Fix it by erroring out early in EmitInstToData().
Similarly, emitIntValue() in a virtual section (SHT_NOBITS in ELF) can crash with the mssage
`non-zero initializer found in section '.bss'` (see D4199)
It'd be nice to report the location but so many directives can call emitIntValue()
and it is difficult to track every location.
Note, COFF does not crash because MCAssembler::writeSectionData() is not
called for an IMAGE_SCN_CNT_UNINITIALIZED_DATA section.
Note, GNU as' arm64 backend reports ``Error: attempt to store non-zero value in section `.bss'``
for a non-zero .inst but fails to do so for other instructions.
We simply reject all instructions, even if the encoding is all zeros.
The Mach-O counterpart is D48517 (see `test/MC/MachO/zerofill-text.s`)
Reviewed By: rnk, skan
Differential Revision: https://reviews.llvm.org/D78138
I plan to use MCSection::getName() in D78138. Having the function in the base class is also convenient for debugging.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D78251
MCExpr has a bunch of free space that is currently going to waste.
Repurpose it as 24 bits of subclass data, which is enough to reduce
the size of all subclasses by 8 bytes. This gives us some respectable
savings for debuginfo builds. Here are the max-rss reductions for the
fat LTO link step:
kc.link 238MiB 231MiB (-2.82%)
sqlite3.link 258MiB 250MiB (-3.27%)
consumer-typeset.link 152MiB 148MiB (-2.51%)
bullet.link 197MiB 192MiB (-2.30%)
tramp3d-v4.link 578MiB 567MiB (-1.92%)
pairlocalalign.link 92MiB 90MiB (-1.98%)
clamscan.link 230MiB 223MiB (-2.81%)
lencod.link 242MiB 235MiB (-2.67%)
SPASS.link 235MiB 230MiB (-2.23%)
7zip-benchmark.link 450MiB 435MiB (-3.25%)
Differential Revision: https://reviews.llvm.org/D77939
This patch intends to provide relocation support for the expression
contains two unpaired relocatable terms with opposite signs.
Differential Revision: https://reviews.llvm.org/D77424
It can be used to avoid passing the begin and end of a range.
This makes the code shorter and it is consistent with another
wrappers we already have.
Differential revision: https://reviews.llvm.org/D78016
* Reorganize tests and add coverage
* Improve diagnostic testing
* Make assert() tests more relevant
* Rename tests to macro-* or altmacro-*
This is not NFC because a (previously untested) diagnostic message is changed.
Summary: We allow non-relaxable instructions emitted into relaxable Fragment when we prefix padding branch. So we need to check if the instruction need relaxation before relaxing it. Without this patch, it currently triggers a `report_fatal_error` in `llvm::MCAsmBackend::relaxInstruction` when we prefix padding branch along with `--mc-relax-all`.
Reviewers: LuoYuanke, reames, MaskRay
Reviewed By: MaskRay
Subscribers: MaskRay, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77851
On PowerPC most functions require a valid TOC pointer.
This is the case because either the function itself needs to use this
pointer to access the TOC or because other functions that are called
from that function expect a valid TOC pointer in the register R2.
The main exception to this is leaf functions that do not access the TOC
since they are guaranteed not to need a valid TOC pointer.
This patch introduces a feature that will allow more functions to not
require a valid TOC pointer in R2.
Differential Revision: https://reviews.llvm.org/D73664
Summary:
Since D75300 has been landed, I want to support enhanced relaxation when we need to align branches and allow prefix padding. "Enhanced Relaxtion" means we allow an instruction that could not be traditionally relaxed to be emitted into RelaxableFragment so that we increase its length by adding prefixes for optimization.
The motivation is straightforward, RelaxFragment is mostly for relative jumps and we can not increase the length of jumps when we need to align them, so if we need to achieve D75300's purpose (reducing the bytes of nops) when need to align jumps, we have to make more instructions "relaxable".
Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight
Reviewed By: reames
Subscribers: hiraditya, llvm-commits, annita.zhang
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76286
Summary:
This patch adds support for emission of following DWARFv5 macro forms
in .debug_macro section.
1. DW_MACRO_start_file
2. DW_MACRO_end_file
3. DW_MACRO_define_strp
4. DW_MACRO_undef_strp.
Reviewed By: dblaikie, ikudrin
Differential Revision: https://reviews.llvm.org/D72828
Summary:
For current architect, we always require setContainingCsect to be
called on every MCSymbol got used in XCOFF context.
This is very hard to achieve because symbols gets created everywhere
and other MCSymbol types(ELF, COFF) do not have similar rules.
It's very easy to miss setting the containing csect, and we would
need to add a lot of XCOFF specialized code around some common code area.
This patch intendeds to do
1. Rely on getFragment().getParent() to get csect from labels.
2. Only use get/setRepresentedCsect (was get/setContainingCsect)
if symbol itself represents a csect.
Reviewers: DiggerLin, hubert.reinterpretcast, daltenty
Differential Revision: https://reviews.llvm.org/D77080
This patch adds
- New arguments to getMinPrefetchStride() to let the target decide on a
per-loop basis if software prefetching should be done even with a stride
within the limit of the hw prefetcher.
- New TTI hook enableWritePrefetching() to let a target do write prefetching
by default (defaults to false).
- In LoopDataPrefetch:
- A search through the whole loop to gather information before emitting any
prefetches. This way the target can get information via new arguments to
getMinPrefetchStride() and emit prefetches more selectively. Collected
information includes: Does the loop have a call, how many memory
accesses, how many of them are strided, how many prefetches will cover
them. This is NFC to before as long as the target does not change its
definition of getMinPrefetchStride().
- If a previous access to the same exact address was 'read', and the
current one is 'write', make it a 'write' prefetch.
- If two accesses that are covered by the same prefetch do not dominate
each other, put the prefetch in a block that dominates both of them.
- If a ConstantMaxTripCount is less than ItersAhead, then skip the loop.
- A SystemZ implementation of getMinPrefetchStride().
Review: Ulrich Weigand, Michael Kruse
Differential Revision: https://reviews.llvm.org/D70228
MC already knows how to emulate the .weak directive (with its ELF
semantics; i.e., an undefined weak symbol resolves to 0, and a defined
weak symbol has lower link precedence than a strong symbol of the same
name) using COFF weak externals. Plumb this through the ASM printer too,
so that definitions marked with __attribute__((weak)) at the language
level (which gets translated to weak linkage at the IR level) have the
corresponding .weak directive emitted. Note that declarations marked
with __attribute__((weak)) at the language level (which translates to
extern_weak at the IR level) already have .weak directives emitted.
Weak*/linkonce* symbols without an associated comdat (in particular, ones
generated with __attribute__((weak)) in C/C++) were earlier emitted as
normal unique globals, as the comdat is required to provide the linkonce
semantics. This change makes sure they are emitted as .weak instead,
allowing other symbols to override them.
Rename the existing coff-weak.ll test to coff-linkonce.ll. I'm not
quite sure what that test covers, since the behavior being tested in it
(the emission of a one_only section) is just a result of passing
-function-sections to llc; the linkonce_odr makes no difference.
Add a new coff-weak.ll which tests the new directive emission.
Based on an previous patch by Shoaib Meenai.
Differential Revision: https://reviews.llvm.org/D44543
Summary:
The current relaxation implementation is not correctly adjusting the
size and offsets of fragements in one section based on changes in size
of another if the layout order of the two happened to be such that the
former was visited before the later. Therefore, we need to invalidate
the fragments in all sections after each iteration of relaxation, and
possibly further relax some of them in the next ieration. This fixes
PR#45190.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76114
MCTargetOptionsCommandFlags.inc and CommandFlags.inc are headers which contain
cl::opt with static storage.
These headers are meant to be incuded by tools to make it easier to parametrize
codegen/mc.
However, these headers are also included in at least two libraries: lldCommon
and handle-llvm. As a result, when creating DYLIB, clang-cpp holds a reference
to the options, and lldCommon holds another reference. Linking the two in a
single executable, as zig does[0], results in a double registration.
This patch explores an other approach: the .inc files are moved to regular
files, and the registration happens on-demand through static declaration of
options in the constructor of a static object.
[0] https://bugzilla.redhat.com/show_bug.cgi?id=1756977#c5
Differential Revision: https://reviews.llvm.org/D75579
alignBranches is X86 specific, change the name in a
more general one since other target can do some state
chang before and after emitting the instruction.