This gives a pretty substantial size reduction; for a 6.5 MB
DLL with 300 KB .xdata, the .xdata shrinks by 66 KB.
Differential Revision: https://reviews.llvm.org/D87369
Convert 2-byte opcodes to equivalent 1-byte ones.
Adjust the existing exhaustive testcase to avoid being altered by
the simplification rules (to keep that test exercising all individual
opcodes).
Fix the assembler parser limits for register pairs; for .seh_save_regp
and .seh_save_regp_x, we can allow up to x29, for a x29+x30 pair
(which gets remapped to the UOP_SaveFPLR(X) opcodes), for .seh_save_fregp
and .seh_save_fregpx, allow up to d14+d15.
Not creating .seh_save_next for float register pairs, as the
actual unwinder implementation in current versions of Windows is buggy
for that case.
This gives a minimal but measurable size reduction. (For a 6.5 MB
DLL with 300 KB .xdata, the .xdata shrinks by 48 bytes. The opcode
sequences are padded to a 4 byte boundary, so very small improvements
might not end up mattering directly.)
Differential Revision: https://reviews.llvm.org/D87367
MASM allows variables defined by equate statements to be used in expressions.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D86946
MASM aligns fields to the _minimum_ of the STRUCT alignment value and the size of the next field.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D86945
Previous implementations for the TLS models General Dynamic and Initial Exec
were missing the ELF::STT_TLS type on symbols that required the type. This patch
adds the type.
Reviewed By: sfertile, MaskRay
Differential Revision: https://reviews.llvm.org/D86777
Add support in llvm-readobj for displaying them and support in the
asm parsser, AArch64TargetStreamer and MCWin64EH for emitting them.
The directives for the remaining basic opcodes have names that
match the opcode in the documentation.
The directives for custom stack cases, that are named
MSFT_OP_TRAP_FRAME, MSFT_OP_MACHINE_FRAME, MSFT_OP_CONTEXT
and MSFT_OP_CLEAR_UNWOUND_TO_CALL, are given matching assembler
directive names that fit into the rest of the opcode naming;
.seh_trap_frame, .seh_context, .seh_clear_unwound_to_call
The opcode MSFT_OP_MACHINE_FRAME is mapped to the existing
opecode enum UOP_PushMachFrame that is used on x86_64, and also
uses the corresponding existing x86_64 directive name
.seh_pushframe.
Differential Revision: https://reviews.llvm.org/D86889
Add support for line continuations (the "backslash operator") in MASM by modifying the Parser's Lex method.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D83347
This ensures that you get the same output regardless if generating
code directly to an object file or if generating assembly and
assembling that.
Add implementations of the EmitARM64WinCFI*() methods in
AArch64TargetAsmStreamer, and fill in one blank in MCAsmStreamer.
Add corresponding directive handlers in AArch64AsmParser and
COFFAsmParser.
Some SEH directive names have been picked to match the prior art
for SEH assembly directives for x86_64, e.g. the spelling of
".seh_startepilogue" matching the preexisting ".seh_endprologue".
For the directives for saving registers, the exact spelling
from the arm64 documentation is picked, e.g. ".seh_save_reg" (to follow
that naming for all the other ones, e.g. ".seh_save_fregp_x"), while
the corresponding one for x86_64 is plain ".seh_savereg" without the
second underscore.
Directives in the epilogues have the same names as in prologues,
e.g. .seh_savereg, even though the registers are restored, not
saved, at that point.
Differential Revision: https://reviews.llvm.org/D86529
This can happen e.g. for code that declare .seh_proc/.seh_endproc
in assembly, or for code that use .seh_handlerdata (which triggers
the unwind info to be emitted before the end of the function).
The TextSection field must be made non-const to be able to use it
with Streamer.SwitchSection().
Differential Revision: https://reviews.llvm.org/D86528
If there's no unwinding opcodes, omit writing the xdata/pdata records.
Previously, this generated truncated xdata records, and llvm-readobj
would error out when trying to print them.
If writing of an xdata record is forced via the .seh_handlerdata
directive, skip it if there's no info to make a sensible unwind
info structure out of, and clearly error out if such info appeared
later in the process.
Differential Revision: https://reviews.llvm.org/D86527
Summary:
Support TOCU and TOCL relocation type for object file generation.
Reviewed by: DiggerLin
Differential Revision: https://reviews.llvm.org/D84549
This patch is the initial support for the Intial Exec Thread Local
Local Storage model to produce code sequence and relocations correct
to the ABI for the model when using PC relative memory operations.
Reviewed By: stefanp
Differential Revision: https://reviews.llvm.org/D81947
This patch is the initial support for the General Dynamic Thread Local
Local Storage model to produce code sequence and relocations correct
to the ABI for the model when using PC relative memory operations.
Patch by: NeHuang
Reviewed By: stefanp
Differential Revision: https://reviews.llvm.org/D82315
Summary:
This is a follow up for D82481. For .lcomm directive, although it's
not necessary to have .rename emitted, it's still desirable to do
it so that we do not see internal 'Rename..' gets print out in
symbol table. And we could have consistent naming between TC entry
and .lcomm. And also have consistent naming between IR and final
object file.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D86075
This patch implements initial backend support for a -mtune CPU controlled by a "tune-cpu" function attribute. If the attribute is not present X86 will use the resolved CPU from target-cpu attribute or command line.
This patch adds MC layer support a tune CPU. Each CPU now has two sets of features stored in their GenSubtargetInfo.inc tables . These features lists are passed separately to the Processor and ProcessorModel classes in tablegen. The tune list defaults to an empty list to avoid changes to non-X86. This annoyingly increases the size of static tables on all target as we now store 24 more bytes per CPU. I haven't quantified the overall impact, but I can if we're concerned.
One new test is added to X86 to show a few tuning features with mismatched tune-cpu and target-cpu/target-feature attributes to demonstrate independent control. Another new test is added to demonstrate that the scheduler model follows the tune CPU.
I have not added a -mtune to llc/opt or MC layer command line yet. With no attributes we'll just use the -mcpu for both. MC layer tools will always follow the normal CPU for tuning.
Differential Revision: https://reviews.llvm.org/D85165
Define the platform ID = 10, and simple mappings between platform ID & name.
Reviewed By: MaskRay, cishida
Differential Revision: https://reviews.llvm.org/D85594
SUMMARY:
1. in the patch , remove setting storageclass in function .getXCOFFSection and construct function of class MCSectionXCOFF
there are
XCOFF::StorageMappingClass MappingClass;
XCOFF::SymbolType Type;
XCOFF::StorageClass StorageClass;
in the MCSectionXCOFF class,
these attribute only used in the XCOFFObjectWriter, (asm path do not need the StorageClass)
we need get the value of StorageClass, Type,MappingClass before we invoke the getXCOFFSection every time.
actually , we can get the StorageClass of the MCSectionXCOFF from it's delegated symbol.
2. we also change the oprand of branch instruction from symbol name to qualify symbol name.
for example change
bl .foo
extern .foo
to
bl .foo[PR]
extern .foo[PR]
3. and if there is reference indirect call a function bar.
we also add
extern .bar[PR]
Reviewers: Jason liu, Xiangling Liao
Differential Revision: https://reviews.llvm.org/D84765
Adds the binary format goff and the operating system zos to the triple
class. goff is selected as default binary format if zos is choosen as
operating system. No further functionality is added.
Reviewers: efriedma, tahonermann, hubert.reinterpertcast, MaskRay
Reviewed By: efriedma, tahonermann, hubert.reinterpertcast
Differential Revision: https://reviews.llvm.org/D82081
Summary:
Use TE SMC instead of TC SMC in large code model mode,
so that large code model TOC entries could get placed after all
the small code model TOC entries, which reduces the chance of TOC overflow.
Reviewed By: Xiangling_L
Differential Revision: https://reviews.llvm.org/D85455
Summary:
AIX assembler does not generate correct relocation when .rename
appear between tc entry label and .tc directive.
So only emit .rename after .tc/.comm or other linkage is emitted.
Reviewed By: daltenty, hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D85317
This reverts commit b497665d98.
Spent some time trying to reproduce this locally, reverting in a
desparate attempt to fix the sanitizer buildbot:
- http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/28828
I don't know exactly why or how this patch breaks the bots, but it seems
pretty concrete that it's the culprit.
Adds the function createMCInst() to MCContext that creates a MCInst using
a typed bump alloctor.
MCInst contains a SmallVector<MCOperand, 8>. The SmallVector is POD only
for <= 8 operands. The default untyped bump pointer allocator of MCContext
does not delete the MCInst, so if the SmallVector grows, it's a leak.
This fixes https://bugs.llvm.org/show_bug.cgi?id=46900.
Part of https://bugs.llvm.org/show_bug.cgi?id=41734
LTO can drop externally available definitions. Such AssociatedSymbol is
not associated with a symbol. ELFWriter::writeSection() will assert.
Allow a SHF_LINK_ORDER section to have sh_link=0.
We need to give sh_link a syntax, a literal zero in the linked-to symbol
position, e.g. `.section name,"ao",@progbits,0`
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D72899
This drops a GNU gold workaround and reverts the revert commit rL366708.
Before binutils 2.34, gold -O2 and above did not correctly handle R_386_GOTOFF to
SHF_MERGE|SHF_STRINGS sections: https://sourceware.org/bugzilla/show_bug.cgi?id=16794
From the original review:
... it reduced the size of a big ARM-32 debug image by 33%. It contained ~68M
of relocations symbols out of total ~71M symbols (96% of symbols table was
generated for relocations with symbol).
-Wl,-O2 (and -Wl,-O3) is so rare that we should just lower the
optimization level for LLVM_LINKER_IS_GOLD rather than pessimizing all users.
For comdats (e.g. caused by -ffunction-sections), Section is already
set here; make sure it's null, for the weak external symbol to be undefined.
This fixes PR46779.
Differential Revision: https://reviews.llvm.org/D84507
A linker optimization is available on PowerPC for GOT indirect PCRelative loads.
The idea is that we can mark a usual GOT indirect load:
pld 3, vec@got@pcrel(0), 1
lwa 3, 4(3)
With a relocation to say that if we don't need to go through the GOT we can let
the linker further optimize this and replace a load with a nop.
pld 3, vec@got@pcrel(0), 1
.Lpcrel1:
.reloc .Lpcrel1-8,R_PPC64_PCREL_OPT,.-(.Lpcrel1-8)
lwa 3, 4(3)
This patch adds the logic that allows the compiler to add the R_PPC64_PCREL_OPT.
Reviewers: nemanjai, lei, hfinkel, sfertile, efriedma, tstellar, grosbach
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D79864
When the compiler generates a GOT indirect load it must generate two loads. One
that loads the address of the element from the GOT and a second to load the
actual element based on the address just loaded from the GOT. However, the
linker can optimize these two loads into one load if it knows that it is safe
to do so. The compiler can tell the linker that the optimization is safe
by using the R_PPC64_PCREL_OPT relocation.
This patch extends the .reloc directive to allow the following setup
pld 3, vec@got@pcrel(0), 1
.Lpcrel1=.-8
... More instructions possible here ...
.reloc .Lpcrel1,R_PPC64_PCREL_OPT,.-.Lpcrel1
lwa 3, 4(3)
Reviewers: nemanjai, lei, hfinkel, sfertile, efriedma, tstellar, grosbach, MaskRay
Reviewed By: nemanjai, MaskRay
Differential Revision: https://reviews.llvm.org/D79625
PTX does not support negative values in .bNN data directives and we must
typecast such values to unsigned before printing them.
MCAsmInfo can now specify whether such casting is necessary for particular
target.
Differential Revision: https://reviews.llvm.org/D83423
Accounting for the fact that Wasm function indices are 32-bit, but in wasm64 we want uniform 64-bit pointers.
Includes reloc types for 64-bit table indices.
Differential Revision: https://reviews.llvm.org/D83729
For `.reloc offset, *, *`, currently offset can be a constant or symbol.
This patch makes it support any expression which can be folded to sym+constant.
Reviewed By: stefanp
Differential Revision: https://reviews.llvm.org/D83751
Replace mutiple `if else` clauses with a `switch` clause and remove redundant checks. Before this patch, we need to add a statement like `if(!isa<MCxxxFragment>(Frag)) ` here each time we add a new kind of `MCEncodedFragment` even if it has no fixups. After this patch, we don't need to do that.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D83366
Summary:
Add support for user-defined types to MasmParser, including initialization and field access.
Known issues:
- Omitted entry initializers (e.g., <,0>) do not work consistently for nested structs/arrays.
- Size checking/inference for values with known types is not yet implemented.
- Some ml64.exe syntaxes for accessing STRUCT fields are not recognized.
- `[<register>.<struct name>].<field>`
- `[<register>[<struct name>.<field>]]`
- `(<struct name> PTR [<register>]).<field>`
- `[<variable>.<struct name>].<field>`
- `(<struct name> PTR <variable>).<field>`
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D75306
Summary:
Change MCExpr to support Aurora VE's modifiers. Change asmparser to use
existing MCExpr parser (parseExpression) to parse an expression contining
symbols with modifiers and offsets. Also add several regression tests
of MC layer.
Reviewers: simoll, k-ishizaka
Reviewed By: simoll
Subscribers: hiraditya, llvm-commits
Tags: #llvm, #ve
Differential Revision: https://reviews.llvm.org/D83170
Summary:
When a desired symbol name contains invalid character that the
system assembler could not process, we need to emit .rename
directive in assembly path in order for that desired symbol name
to appear in the symbol table.
Reviewed By: hubert.reinterpretcast, DiggerLin, daltenty, Xiangling_L
Differential Revision: https://reviews.llvm.org/D82481
Summary:
When parsing 64-bit MASM, treat memory operands with unspecified base register as RIP-based.
Documented in several places, including https://software.intel.com/en-us/articles/introduction-to-x64-assembly: "Unfortunately, MASM does not allow this form of opcode, but other assemblers like FASM and YASM do. Instead, MASM embeds RIP-relative addressing implicitly."
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D73227
This change lets LLVM use the LC_BUILD_VERSION command when building for macOS 10.14, iOS 12, tvOS 12, and watchOS 5.
Additionally, this change ensures that new platforms like Apple Silicon macOS / Mac Catalyst,
and simulators running on Apple Silicon alway use LC_BUILD_VERSION with the OS version set to the
minimum supported OS version if the deployment target version is older.
Differential Revision: https://reviews.llvm.org/D82836
Give up folding an expression if the fragment of one of the operands
would require laying out a fragment already being laid out. This
prevents hitting an infinite recursion when a fill size expression
refers to a later fragment since computing the offset of that fragment
would require laying out the fill fragment and thus computing its size
expression.
Reviewed By: echristo
Differential Revision: https://reviews.llvm.org/D79570
Currently, section indices may be passed uninitialized by value if
writing the section fails. Removes section indices form class
initialization and returns them from the write{Code,Data}Section
function calls instead.
Patch by Gui Andrade!
Differential Revision: https://reviews.llvm.org/D81702
This features matches ELFAsmParser and makes it possible to use `.section ".llvm.call-graph-profile","n"`
Reviewed By: zequanwu
Differential Revision: https://reviews.llvm.org/D82240
The patch renames MakeStartMinusEndExpr() to makeEndMinusStartExpr() to
better reflect an expression it creates and fix a naming style issue.
Differential Revision: https://reviews.llvm.org/D82079
Compiling assembly files when newlines are reduced to line markers within a `.macro` context will generate wrong information in `.debug_line` section.
This patch fixes this issue by evaluating line markers within the macro scope but not when they are used and evaluated.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D80381
Note that .eh_frame sections are generated in the 32-bit format even
when debug sections are 64-bit, for compatibility reasons. They use
relative references between entries, so they hardly benefit from the
64-bit format.
Differential Revision: https://reviews.llvm.org/D81149
DW_FORM_sec_offset was introduced in DWARFv4, so, for 64-bit DWARFv3,
DW_FORM_data8 should be used instead.
Differential Revision: https://reviews.llvm.org/D81148
The patch enables producing DWARF64 compilation units and fixes
generating references to .debug_abbrev and .debug_line sections.
A similar change for .debug_ranges/.debug_rnglists will be added
in a forthcoming patch.
Differential Revision: https://reviews.llvm.org/D81145
The patch adds an option `--dwarf64` to instruct a tool to generate
debug information in the 64-bit DWARF format. There is no real
implementation yet, only a few compatibility checks.
Differential Revision: https://reviews.llvm.org/D81143
This adds 4 new reloc types.
A lot of code that previously assumed any memory or offset values could be contained in a uint32_t (and often truncated results from functions returning 64-bit values) have been upgraded to uint64_t. This is not comprehensive: it is only the values that come in contact with the new relocation values and their dependents.
A new tablegen mapping was added to automatically upgrade loads/stores in the assembler, which otherwise has no way to select for these instructions (since they are indentical other than for the offset immediate). It follows a similar technique to https://reviews.llvm.org/D53307
Differential Revision: https://reviews.llvm.org/D81704
Summary:
This commit slightly modifies the MCDisassembler, and llvm-objdump to
allow targets to also decode entire symbols.
WebAssembly uses the onSymbolStart hook it to decode preludes.
WebAssembly partially disassembles the symbol in its target specific
way; and then falls back to the normal flow of llvm-objdump.
AMDGPU needs it to decode kernel descriptors entirely, and move to the
next symbol.
This commit is to split the above task into 2.
- Changes to llvm-objdump and MC-layer without breaking WebAssembly code
[ this commit ]
- AMDGPU's implementation of onSymbolStart that decodes kernel
descriptors. [ https://reviews.llvm.org/D80713 ]
Reviewers: scott.linder, t-tye, sunfish, arsenm, jhenderson, MaskRay, aardappel
Reviewed By: scott.linder, jhenderson, aardappel
Subscribers: bcain, dschuff, wdng, tpr, sbc100, jgravelle-google, hiraditya, aheejin, MaskRay, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80512
SUMMARY:
Since we deal with aix emitLinkage in the PPCAIXAsmPrinter::emitLinkage() in the patch https://reviews.llvm.org/D75866. It do not go to AsmPrinter::emitLinkage() any more, we clean up some aix related code in the AsmPrinter::emitLinkage()
Reviewers: Jason liu
Differential Revision: https://reviews.llvm.org/D81613
SUMMARY:
in the aix assembly , it do not have .hidden and .protected directive.
in current llvm. if a function or a variable which has visibility attribute, it will generate something like the .hidden or .protected , it can not recognize by aix as.
in aix assembly, the visibility attribute are support in the pseudo-op like
.extern Name [ , Visibility ]
.globl Name [, Visibility ]
.weak Name [, Visibility ]
in this patch, we implement the visibility attribute for the global variable, function or extern function .
for example.
extern __attribute__ ((visibility ("hidden"))) int
bar(int* ip);
__attribute__ ((visibility ("hidden"))) int b = 0;
__attribute__ ((visibility ("hidden"))) int
foo(int* ip){
return (*ip)++;
}
the visibility of .comm linkage do not support , we will have a separate patch for it.
we have the unsupported cases ("default" and "internal") , we will implement them in a a separate patch for it.
Reviewers: Jason Liu ,hubert.reinterpretcast,James Henderson
Differential Revision: https://reviews.llvm.org/D75866
If there are more than 65534 relocation entries in a single section,
we should generate an overflow section.
Since we don't support overflow section for now, we should generate
an error.
Differential revision: https://reviews.llvm.org/D81104
Without this change, names start with 'L' will get created as
temporary symbol in MCContext::createSymbol.
Some other potential prefix considered:
.L, does not work for AIX, as a function start with L will end
up with .L as prefix for its function entry point.
..L could work, but it does not play well with the convention
on AIX that anything start with '.' are considered as entry point.
L. could work, but not sure if it's safe enough, as it's possible
to have suffixes like .something append to a plain L, giving L.something
which is not necessarily a temporary.
That's why we picked L.. for now.
Differential Revision: https://reviews.llvm.org/D80831
Summary:
The standard data emission directives (e.g. .short, .long) in the AIX assembler
have the unintended consequence of aligning their output to the natural byte
boundary. This cause problems because we aren't expecting behavior from the
Data*bitsDirectives, so the final alignment of data isn't correct in some cases
on AIX.
This patch updated the Data*bitsDirectives to use .vbyte pseudo-ops instead to emit the
data, since we will emit the .align directives as needed. We update the existing
testcases and add a test for emission of struct data.
Reviewers: hubert.reinterpretcast, Xiangling_L, jasonliu
Reviewed By: hubert.reinterpretcast, jasonliu
Subscribers: wuzish, nemanjai, hiraditya, kbarton, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80934
This patch adds support for emission of following DWARFv5 macro
forms in .debug_macro.dwo section:
- DW_MACRO_start_file
- DW_MACRO_end_file
- DW_MACRO_define_strx
- DW_MACRO_undef_strx
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D78866
Previously in the object format we punted on this and simply wrote
zeros (and didn't include the function in the elem segment). With
this change we write a meaningful value which is the segment
relative table index of the associated function.
This matches the that wasm-ld produces in `-r` mode. This inconsistency
between the output the MC object writer and the wasm-ld object
writer could cause warnings to be emitted when reading back in the
output of `wasm-ld -r`. See:
https://github.com/emscripten-core/emscripten/issues/11217
This only applies to this one relocation type which is only generated
when compiling in PIC mode.
Differential Revision: https://reviews.llvm.org/D80774
SUMMARY:
when there are two symbol has the same address. llvm-objdump -D -symbol-description will select symbol based on the following rule:
1. using Label first if there is a Label symbol.
2. If there is not Label, using a symbol which has Storage Mapping class.
3. if more than one symbol has storage mapping class, put the TC0 has the low priority, for other storage mapping class , compare based on the value.
Reviewers: James Henderson ,hubert.reinterpretcast,
Differential Revision: https://reviews.llvm.org/D78387
Negations are incorrectly added in numerous places and the code just happens to work.
Also fix a missed DW_CFA_def_cfa_offset negation in c693b9c321d5a40d012340619674cf790c9ac86c:
ARMAsmBackendDarwin::generateCompactUnwindEncoding
To be consistent with other directives like '.comm', '.lcomm', we remove
the spaces after the comma for '.csect' on AIX.
Differential Revision: https://reviews.llvm.org/D80247
The function does not need an MCStreamer per se; it was used only to get
access to the MCContext.
Differential Revision: https://reviews.llvm.org/D80205
In D49466, sys::path::replace_path_prefix was used instead startswith for -f[macro/debug/file]-prefix-map options.
However those were reverted later (commit rG3bb24bf25767ef5bbcef958b484e7a06d8689204) due to broken Windows tests.
This patch restores those replace_path_prefix calls.
It also modifies the prefix matching to be case-insensitive under Windows.
Differential Revision : https://reviews.llvm.org/D76869
Summary:
This patch tries to emit the correct alignment result for both
object file generation path and assembly path.
Reviewed by: hubert.reinterpretcast, DiggerLin, daltenty
Differential Revision: https://reviews.llvm.org/D79127
Summary:
This patch introduces the constants defined to identify DWARF sections
in XCOFF into `llvm/BinaryFormat/XCOFF.h` and adds (NFC) placeholder
code to `llvm/lib/MC/MCObjectFileInfo.cpp` where the DWARF sections for
XCOFF are to be set up.
Reviewers: jasonliu, sfertile, daltenty, DiggerLin, Xiangling_L
Reviewed By: jasonliu, sfertile, DiggerLin
Differential Revision: https://reviews.llvm.org/D79220
The function MCSymbolRefExpr::getVariantKindForName was missing the entry for
VK_PPC_GOT_PCREL. This patch adds the missing entry.
Differential Revision: https://reviews.llvm.org/D79015
The setting of `MCAsmInfo` properties for XCOFF got split between
`MCAsmInfoXCOFF` and `PPCXCOFFMCAsmInfo`. Except for the properties that
are dependent on the target information being passed via the
constructor, the properties being set in `PPCXCOFFMCAsmInfo` had no
fundamental reason for being treated as specific for XCOFF on PowerPC.
Indeed, the property that might be considered more specific to PowerPC,
`NeedsFunctionDescriptors`, was set in `MCAsmInfoXCOFF`.
XCOFF being specific to PowerPC anyway, this patch consolidates the
setting of the properties into `MCAsmInfoXCOFF` except for the cases
that are dependent on the information provided via the
`PPCXCOFFMCAsmInfo` constructor.
This patch also reorders the assignments to the fields to match the
declaration order in `MCAsmInfo`.
This change add support for defined wasm globals in the .s format,
the MC layer, and wasm-ld
Currently there is no support custom initialization and all wasm
globals are initialized to zero.
Fixes: PR45742
Differential Revision: https://reviews.llvm.org/D79137
The generic implementation is actually specific to x86. It assumes the
offset is relative to the end of the instruction and the immediate is
not scaled (which is false on most RISC).
Summary:
Before this patch, `relaxInstruction` takes three arguments, the first
argument refers to the instruction before relaxation and the third
argument is the output instruction after relaxation. There are two quite
strange things:
1) The first argument's type is `const MCInst &`, the third
argument's type is `MCInst &`, but they may be aliased to the same
variable
2) The backends of ARM, AMDGPU, RISC-V, Hexagon assume that the third
argument is a fresh uninitialized `MCInst` even if `relaxInstruction`
may be called like `relaxInstruction(Relaxed, STI, Relaxed)` in a
loop.
In this patch, we drop the thrid argument, and let `relaxInstruction`
directly modify the given instruction. Also, this patch fixes the bug https://bugs.llvm.org/show_bug.cgi?id=45580, which is introduced by D77851, and
breaks the assumption of ARM, AMDGPU, RISC-V, Hexagon.
Reviewers: Razer6, MaskRay, jyknight, asb, luismarques, enderby, rtaylor, colinl, bcain
Reviewed By: Razer6, MaskRay, bcain
Subscribers: bcain, nickdesaulniers, nathanchance, wuzish, annita.zhang, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, tpr, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78364
According to DWARF standard, is_stmt is a global flag; when set or cleared it should affect subsequent .loc directives.
However llvm assembler handled is_stmt differently: it forced all locations to have is_stmt=1 unless is_stmt was specified explicitly as 0.
The fix utilizes current DWARF state flags to compute correct is_stmt values.
See https://bugs.llvm.org/show_bug.cgi?id=45529 for a detailed issue description.
Reviewers: arsenm, probinson, enderby
Differential Revision: https://reviews.llvm.org/D78102
Add initial support for PC Relative addressing for global values that
require GOT indirect addressing. This patch adds PCRelative support for
global addresses that may not be known at link time and may require
access through the GOT.
Differential Revision: https://reviews.llvm.org/D76064
Ensure that symbols explicitly* assigned a section name are placed into
a section with a compatible entry size.
This is done by creating multiple sections with the same name** if
incompatible symbols are explicitly given the name of an incompatible
section, whilst:
- Avoiding using uniqued sections where possible (for readability and
to maximize compatibly with assemblers).
- Creating as few SHF_MERGE sections as possible (for efficiency).
Given that each symbol is assigned to a section in a single pass, we
must decide which section each symbol is assigned to without seeing the
properties of all symbols. A stable and easy to understand assignment is
desirable. The following rules facilitate this: The "generic" section
for a given section name will be mergeable if the name is a mergeable
"default" section name (such as .debug_str), a mergeable "implicit"
section name (such as .rodata.str2.2), or MC has already created a
mergeable "generic" section for the given section name (e.g. in response
to a section directive in inline assembly). Otherwise, the "generic"
section for a given name is non-mergeable; and, non-mergeable symbols
are assigned to the "generic" section, while mergeable symbols are
assigned to uniqued sections.
Terminology:
"default" sections are those always created by MC initially, e.g. .text
or .debug_str.
"implicit" sections are those created normally by MC in response to the
symbols that it encounters, i.e. in the absence of an explicit section
name assignment on the symbol, e.g. a function foo might be placed into
a .text.foo section.
"generic" sections are those that are referred to when a unique section
ID is not supplied, e.g. if there are multiple unique .bob sections then
".quad .bob" will reference the generic .bob section. Typically, the
generic section is just the first section of a given name to be created.
Default sections are always generic.
* Typically, section names might be explicitly assigned in source code
using a language extension e.g. a section attribute: _attribute_
((section ("section-name"))) -
https://clang.llvm.org/docs/AttributeReference.html
** I refer to such sections as unique/uniqued sections. In assembly the
", unique," assembly syntax is used to express such sections.
Fixes https://bugs.llvm.org/show_bug.cgi?id=43457.
See https://reviews.llvm.org/D68101 for previous discussions leading to
this patch.
Some minor fixes were required to LLVM's tests, for tests had been using
the old behavior - which allowed for explicitly assigning globals with
incompatible entry sizes to a section.
This fix relies on the ",unique ," assembly feature. This feature is not
available until bintuils version 2.35
(https://sourceware.org/bugzilla/show_bug.cgi?id=25380). If the
integrated assembler is not being used then we avoid using this feature
for compatibility and instead try to place mergeable symbols into
non-mergeable sections or issue an error otherwise.
Differential Revision: https://reviews.llvm.org/D72194
GNU as emits SHT_PROGBITS .eh_frame by default for .cfi_* directives.
We follow x86-64 psABI and use SHT_X86_64_UNWIND for .eh_frame
Don't error for SHT_PROGBITS .eh_frame on x86-64.
This keeps compatibility with `.section .eh_frame,"a",@progbits` in existing assembly files.
See https://groups.google.com/d/msg/x86-64-abi/7sr4E6THl3g/zUU2UPHOAQAJ
for more discussions.
Reviewed By: joerg
Differential Revision: https://reviews.llvm.org/D76151
For `.bss; nop`, MC inappropriately calls abort() (via report_fatal_error()) with a message
`cannot have fixups in virtual section!`
It is a bug to crash for invalid user input. Fix it by erroring out early in EmitInstToData().
Similarly, emitIntValue() in a virtual section (SHT_NOBITS in ELF) can crash with the mssage
`non-zero initializer found in section '.bss'` (see D4199)
It'd be nice to report the location but so many directives can call emitIntValue()
and it is difficult to track every location.
Note, COFF does not crash because MCAssembler::writeSectionData() is not
called for an IMAGE_SCN_CNT_UNINITIALIZED_DATA section.
Note, GNU as' arm64 backend reports ``Error: attempt to store non-zero value in section `.bss'``
for a non-zero .inst but fails to do so for other instructions.
We simply reject all instructions, even if the encoding is all zeros.
The Mach-O counterpart is D48517 (see `test/MC/MachO/zerofill-text.s`)
Reviewed By: rnk, skan
Differential Revision: https://reviews.llvm.org/D78138
I plan to use MCSection::getName() in D78138. Having the function in the base class is also convenient for debugging.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D78251
MCExpr has a bunch of free space that is currently going to waste.
Repurpose it as 24 bits of subclass data, which is enough to reduce
the size of all subclasses by 8 bytes. This gives us some respectable
savings for debuginfo builds. Here are the max-rss reductions for the
fat LTO link step:
kc.link 238MiB 231MiB (-2.82%)
sqlite3.link 258MiB 250MiB (-3.27%)
consumer-typeset.link 152MiB 148MiB (-2.51%)
bullet.link 197MiB 192MiB (-2.30%)
tramp3d-v4.link 578MiB 567MiB (-1.92%)
pairlocalalign.link 92MiB 90MiB (-1.98%)
clamscan.link 230MiB 223MiB (-2.81%)
lencod.link 242MiB 235MiB (-2.67%)
SPASS.link 235MiB 230MiB (-2.23%)
7zip-benchmark.link 450MiB 435MiB (-3.25%)
Differential Revision: https://reviews.llvm.org/D77939
This patch intends to provide relocation support for the expression
contains two unpaired relocatable terms with opposite signs.
Differential Revision: https://reviews.llvm.org/D77424
It can be used to avoid passing the begin and end of a range.
This makes the code shorter and it is consistent with another
wrappers we already have.
Differential revision: https://reviews.llvm.org/D78016
* Reorganize tests and add coverage
* Improve diagnostic testing
* Make assert() tests more relevant
* Rename tests to macro-* or altmacro-*
This is not NFC because a (previously untested) diagnostic message is changed.
Summary: We allow non-relaxable instructions emitted into relaxable Fragment when we prefix padding branch. So we need to check if the instruction need relaxation before relaxing it. Without this patch, it currently triggers a `report_fatal_error` in `llvm::MCAsmBackend::relaxInstruction` when we prefix padding branch along with `--mc-relax-all`.
Reviewers: LuoYuanke, reames, MaskRay
Reviewed By: MaskRay
Subscribers: MaskRay, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77851
On PowerPC most functions require a valid TOC pointer.
This is the case because either the function itself needs to use this
pointer to access the TOC or because other functions that are called
from that function expect a valid TOC pointer in the register R2.
The main exception to this is leaf functions that do not access the TOC
since they are guaranteed not to need a valid TOC pointer.
This patch introduces a feature that will allow more functions to not
require a valid TOC pointer in R2.
Differential Revision: https://reviews.llvm.org/D73664
Summary:
Since D75300 has been landed, I want to support enhanced relaxation when we need to align branches and allow prefix padding. "Enhanced Relaxtion" means we allow an instruction that could not be traditionally relaxed to be emitted into RelaxableFragment so that we increase its length by adding prefixes for optimization.
The motivation is straightforward, RelaxFragment is mostly for relative jumps and we can not increase the length of jumps when we need to align them, so if we need to achieve D75300's purpose (reducing the bytes of nops) when need to align jumps, we have to make more instructions "relaxable".
Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight
Reviewed By: reames
Subscribers: hiraditya, llvm-commits, annita.zhang
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76286
Summary:
This patch adds support for emission of following DWARFv5 macro forms
in .debug_macro section.
1. DW_MACRO_start_file
2. DW_MACRO_end_file
3. DW_MACRO_define_strp
4. DW_MACRO_undef_strp.
Reviewed By: dblaikie, ikudrin
Differential Revision: https://reviews.llvm.org/D72828
Summary:
For current architect, we always require setContainingCsect to be
called on every MCSymbol got used in XCOFF context.
This is very hard to achieve because symbols gets created everywhere
and other MCSymbol types(ELF, COFF) do not have similar rules.
It's very easy to miss setting the containing csect, and we would
need to add a lot of XCOFF specialized code around some common code area.
This patch intendeds to do
1. Rely on getFragment().getParent() to get csect from labels.
2. Only use get/setRepresentedCsect (was get/setContainingCsect)
if symbol itself represents a csect.
Reviewers: DiggerLin, hubert.reinterpretcast, daltenty
Differential Revision: https://reviews.llvm.org/D77080
This patch adds
- New arguments to getMinPrefetchStride() to let the target decide on a
per-loop basis if software prefetching should be done even with a stride
within the limit of the hw prefetcher.
- New TTI hook enableWritePrefetching() to let a target do write prefetching
by default (defaults to false).
- In LoopDataPrefetch:
- A search through the whole loop to gather information before emitting any
prefetches. This way the target can get information via new arguments to
getMinPrefetchStride() and emit prefetches more selectively. Collected
information includes: Does the loop have a call, how many memory
accesses, how many of them are strided, how many prefetches will cover
them. This is NFC to before as long as the target does not change its
definition of getMinPrefetchStride().
- If a previous access to the same exact address was 'read', and the
current one is 'write', make it a 'write' prefetch.
- If two accesses that are covered by the same prefetch do not dominate
each other, put the prefetch in a block that dominates both of them.
- If a ConstantMaxTripCount is less than ItersAhead, then skip the loop.
- A SystemZ implementation of getMinPrefetchStride().
Review: Ulrich Weigand, Michael Kruse
Differential Revision: https://reviews.llvm.org/D70228
MC already knows how to emulate the .weak directive (with its ELF
semantics; i.e., an undefined weak symbol resolves to 0, and a defined
weak symbol has lower link precedence than a strong symbol of the same
name) using COFF weak externals. Plumb this through the ASM printer too,
so that definitions marked with __attribute__((weak)) at the language
level (which gets translated to weak linkage at the IR level) have the
corresponding .weak directive emitted. Note that declarations marked
with __attribute__((weak)) at the language level (which translates to
extern_weak at the IR level) already have .weak directives emitted.
Weak*/linkonce* symbols without an associated comdat (in particular, ones
generated with __attribute__((weak)) in C/C++) were earlier emitted as
normal unique globals, as the comdat is required to provide the linkonce
semantics. This change makes sure they are emitted as .weak instead,
allowing other symbols to override them.
Rename the existing coff-weak.ll test to coff-linkonce.ll. I'm not
quite sure what that test covers, since the behavior being tested in it
(the emission of a one_only section) is just a result of passing
-function-sections to llc; the linkonce_odr makes no difference.
Add a new coff-weak.ll which tests the new directive emission.
Based on an previous patch by Shoaib Meenai.
Differential Revision: https://reviews.llvm.org/D44543
Summary:
The current relaxation implementation is not correctly adjusting the
size and offsets of fragements in one section based on changes in size
of another if the layout order of the two happened to be such that the
former was visited before the later. Therefore, we need to invalidate
the fragments in all sections after each iteration of relaxation, and
possibly further relax some of them in the next ieration. This fixes
PR#45190.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76114
MCTargetOptionsCommandFlags.inc and CommandFlags.inc are headers which contain
cl::opt with static storage.
These headers are meant to be incuded by tools to make it easier to parametrize
codegen/mc.
However, these headers are also included in at least two libraries: lldCommon
and handle-llvm. As a result, when creating DYLIB, clang-cpp holds a reference
to the options, and lldCommon holds another reference. Linking the two in a
single executable, as zig does[0], results in a double registration.
This patch explores an other approach: the .inc files are moved to regular
files, and the registration happens on-demand through static declaration of
options in the constructor of a static object.
[0] https://bugzilla.redhat.com/show_bug.cgi?id=1756977#c5
Differential Revision: https://reviews.llvm.org/D75579
alignBranches is X86 specific, change the name in a
more general one since other target can do some state
chang before and after emitting the instruction.
These symbols need to be external (MSVC tools error out if a weak
external points at a symbol that isn't external; this was tried before
but had to be reverted in bc5b7217dc,
and this was originally explicitly fixed in
732eeaf2a9).
If multiple object files have weak symbols with defaults, their
defaults could cause linker errors due to duplicate definitions,
unless the names of the defaults are unique.
GNU binutils handles this by appending the name of another symbol
from the same object file to the name of the default symbol. Try
to implement something similar; before writing the object file,
locate a symbol that should have a unique name and use the name of
that one for making the weak defaults unique.
Differential Revision: https://reviews.llvm.org/D75989
For context, the proposed RISC-V bit manipulation extension has a subset
of instructions which require one of two SubtargetFeatures to be
enabled, 'zbb' or 'zbp', and there is no defined feature which both of
these can imply to use as a constraint either (see comments in D65649).
AssemblerPredicates allow multiple SubtargetFeatures to be declared in
the "AssemblerCondString" field, separated by commas, and this means
that the two features must both be enabled. There is no equivalent to
say that _either_ feature X or feature Y must be enabled, short of
creating a dummy SubtargetFeature for this purpose and having features X
and Y imply the new feature.
To solve the case where X or Y is needed without adding a new feature,
and to better match a typical TableGen style, this replaces the existing
"AssemblerCondString" with a dag "AssemblerCondDag" which represents the
same information. Two operators are defined for use with
AssemblerCondDag, "all_of", which matches the current behaviour, and
"any_of", which adds the new proposed ORing features functionality.
This was originally proposed in the RFC at
http://lists.llvm.org/pipermail/llvm-dev/2020-February/139138.html
Changes to all current backends are mechanical to support the replaced
functionality, and are NFCI.
At this stage, it is illegal to combine features with ands and ors in a
single AssemblerCondDag. I suspect this case is sufficiently rare that
adding more complex changes to support it are unnecessary.
Differential Revision: https://reviews.llvm.org/D74338
Summary:
Currently, a BoundaryAlign fragment may be inserted after the branch
that needs to be aligned to truncate the current fragment, this fragment is
unused at most of time. To avoid that, we can insert a new empty Data
fragment instead. Non-relaxable instruction is usually emitted into Data
fragment, so the inserted empty Data fragment will be reused at a high
possibility.
Reviewers: annita.zhang, reames, MaskRay, craig.topper, LuoYuanke, jyknight
Reviewed By: reames, LuoYuanke
Subscribers: llvm-commits, dexonsmith, hiraditya
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75438
Summary:
Mixing stackmaps and DWARF in a single file triggers an assertion in
MCMachOStreamer as stackmap sections are emitted in AsmPrinter::emitEndOfAsmFile
after the DWARF sections have already been emitted.
Since it should be safe to emit stackmap sections after DWARF sections this
patch relaxes the assertion to allow that.
Reviewers: aprantl, dblaikie, echristo
Subscribers: hiraditya, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75836
Summary:
Dollar signed prefixed integers were not allowed by the AsmParser to be
used as Identifiers, differing from the GNU assembler behavior.
This patch updates the parsing of Identifiers to consider such cases as
valid, where the identifier string includes the $ prefix itself. As the
Lexer currently splits these occurrences into separate tokens, those
need to be combined by the AsmParser itself.
Reviewers: efriedma, chill
Reviewed By: efriedma
Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75111
```
// clang -c -gdwarf-5 a.s -o a.o
.section .init; ret
.text; ret
```
.debug_info contains DW_AT_ranges and llvm-dwarfdump will report
a verification error because .debug_rnglists does not exist (not
implemented).
This patch generates .debug_rnglists for assembly files.
emitListsTableHeaderStart() in DwarfDebug.cpp can be shared with
MCDwarf.cpp. Because CodeGen depends on MC, I move the function to
MCDwarf.cpp
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D75375
SUMMARY:
1.if there is a gap between the end virtual address of one section and the beginning virtual address of the next section, the XCOFFObjectWriter.cpp will hit a assert.
2.as discussed in the patch https://reviews.llvm.org/D66969,
since implemented the function description. We can output the raw object data for function.
we need to create a test for raw text section content and test section header for xcoff object file.
Reviewer: daltenty,hubert.reinterpretcast,jasonliu
Differential Revision: https://reviews.llvm.org/D71845
Summary:
Currently the boundaryalign fragment caches its size during the process
of layout and then it is relaxed and update the size in each iteration. This
behaviour is unnecessary and ugly.
Reviewers: annita.zhang, reames, MaskRay, craig.topper, LuoYuanke, jyknight
Reviewed By: MaskRay
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75404
MCObjectStreamer is more suitable to create fragments than
X86AsmBackend, for example, the function getOrCreateDataFragment is
defined in MCObjectStreamer.
Differential Revision: https://reviews.llvm.org/D75351