This patch extends LLVM IR to add metadata that can be used to emit macho files with two build version load commands.
It utilizes "darwin.target_variant.triple" and "darwin.target_variant.SDK Version" metadata names for that,
which will be set by a future patch in clang.
MachO uses two build version load commands to represent an object file / binary that is targeting both the macOS target,
and the Mac Catalyst target. At runtime, a dynamic library that supports both targets can be loaded from either a native
macOS or a Mac Catalyst app on a macOS system. We want to add support to this to upstream to LLVM to be able to build
compiler-rt for both targets, to finish the complete support for the Mac Catalyst platform, which is right now targetable
by upstream clang, but the compiler-rt bits aren't supported because of the lack of this multiple build version support.
Differential Revision: https://reviews.llvm.org/D112189
This patch adds `emitXCOFFSymbolLinkageWithVisibility` to MCNullStreamer to fix llvm_unreachable getting reached when using option `-filetype=null` on AIX.
Reviewed By: DiggerLin
Differential Revision: https://reviews.llvm.org/D114876
Follow up to D92052 and D94072, exposed due to D107707
Many assemblers to permit that only the first .section contains all
the attributes like '.lds_bss,"w",@nobits' and later section only
use the name ('.lds_bss') inheriting those attributes from the first
section. I turned out that the case that Type changed was missed
when implementing it - and D107707 make it much more likely to hit
that issue. That's fixed by this commit.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D114717
For relocations against temporary symbols (that don't persist in
the object file), we normally adjust them to reference the start of
the section.
For adrp relocations, the immediate offset from the referenced
symbol is stored in the opcode as the 21 bit signed immediate; this
means that the symbol referenced must be within +/- 1 MB from the
referenced symbol.
Create label symbols with regular intervals (1 MB intervals). For
relocations against temporary symbols, pick the preceding added
offset symbol and make the relocation against that instead of
against the start of the section.
This should fix the root issue behind
https://bugs.llvm.org/show_bug.cgi?id=52378.
Differential Revision: https://reviews.llvm.org/D114340
For tagged-globals, we only need to disable relaxation for globals that
we actually tag. With this patch function pointer relocations, which
we do not instrument, can be relaxed.
This patch also makes tagged-globals work properly with LTO, as
-Wa,-mrelax-relocations=no doesn't work with LTO.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D113220
In a LTO build, the `end_sequence` in debug_line table for each compile unit (CU) points the end of text section which merged all CUs. The `end_sequence` needs to point to the end of each CU's range. This bug often causes invalid `debug_line` table in the final `.dSYM` binary for MachO after running `dsymutil` which tries to compensate an out-of-range address of `end_sequence`.
The fix is to sync the line table termination with the range operations that are already maintained in DwarfDebug. When CU or section changes, or nodebug functions appear or module is finished, the prior pending line table is terminated using the last range label. In the MC path where no range is tracked, the old logic is conservatively used to end the line table using the section end symbol.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D108261
[NFC] This patch fixes URLs containing "master". Old URLs were either broken or
redirecting to the new URL.
Reviewed By: #libc, ldionne, mehdi_amini
Differential Revision: https://reviews.llvm.org/D113186
If a tool wants to introduce new indirections via stubs at link-time in
ORC, it can cause fidelity issues around the address of the function if
some references to the function do not have relocations. This is known
to happen inside the body of the function itself on x86_64 for example,
where a PC-relative address is formed, but without a relocation.
```
_foo:
leaq -7(%rip), %rax ## form pointer to '_foo' without relocation
_bar:
leaq (%rip), %rax ## uses X86_64_RELOC_SIGNED to '_foo'
```
The consequence of introducing a stub for such a function at link time
is that if it forms a pointer to itself without relocation, it will not
have the same value as a pointer from outside the function. If the
function pointer is used as a key, this can cause problems.
This utility provides best-effort support for adding such missing
relocations using MCDisassembler and MCInstrAnalysis to identify the
problematic instructions. Currently it is only implemented for x86_64.
Note: the related issue with call/jump instructions is not handled
here, only forming function pointers.
rdar://83514317
Differential revision: https://reviews.llvm.org/D113038
Emit __clangast in custom section instead of named data segment
to find it while iterating sections.
This could be avoided if all data segements (the wasm sense) were
represented as their own sections (in the llvm sense).
This can be resolved by https://github.com/WebAssembly/tool-conventions/issues/138
And the on-disk hashtable in clangast needs to be aligned by 4 bytes,
so add paddings in name length field in custom section header.
The length of clangast section name can be represented in 1 byte
by leb128, and possible maximum pads are 3 bytes, so the section
name length won't be invalid in theory.
Fixes https://bugs.llvm.org/show_bug.cgi?id=35928
Differential Revision: https://reviews.llvm.org/D74531
This moves the registry higher in the LLVM library dependency stack.
Every client of the target registry needs to link against MC anyway to
actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.
This removes `WasmTagType`. `WasmTagType` contained an attribute and a
signature index:
```
struct WasmTagType {
uint8_t Attribute;
uint32_t SigIndex;
};
```
Currently the attribute field is not used and reserved for future use,
and always 0. And that this class contains `SigIndex` as its property is
a little weird in the place, because the tag type's signature index is
not an inherent property of a tag but rather a reference to another
section that changes after linking. This makes tag handling in the
linker also weird that tag-related methods are taking both `WasmTagType`
and `WasmSignature` even though `WasmTagType` contains a signature
index. This is because the signature index changes in linking so it
doesn't have any info at this point. This instead moves `SigIndex` to
`struct WasmTag` itself, as we did for `struct WasmFunction` in D111104.
In this CL, in lib/MC and lib/Object, this now treats tag types in the
same way as function types. Also in YAML, this removes `struct Tag`,
because now it only contains the tag index. Also tags set `SigIndex` in
`WasmImport` union, as functions do.
I think this makes things simpler and makes tag handling more in line
with function handling. These two shares similar properties in that both
of them have signatures, but they are kind of nominal so having the same
signature doesn't mean they are the same element.
Also a drive-by fix: the reserved 'attirubute' part's encoding changed
from uleb32 to uint8 a while ago. This was fixed in lib/MC and
lib/Object but not in YAML. This doesn't change object files because the
field's value is always 0 and its encoding is the same for the both
encoding.
This is effectively NFC; I didn't mark it as such just because it
changed YAML test results.
Reviewed By: sbc100, tlively
Differential Revision: https://reviews.llvm.org/D111086
- Introduce a skeleton outline for the GOFFAsmParser
- Before instantiating AsmParser/HLASMAsmParser, target specific asm parsers are attempted to be initialized first before proceeding. If it doesn't exist for a particular file type, we report a fatal error.
- This patch allows to properly instantiate the HLASMAsmParser on z/OS, and ensures we can write lit tests and unit tests which will involve the instantiation of asm parsers, without an assert / fatal error.
Reviewed By: uweigand, Kai
Differential Revision: https://reviews.llvm.org/D110730
Add MCDwarfLineStr class to the public API.
Note that MCDwarfLineTableHeader::Emit(), takes MCDwarfLineStr as
an Optional<> parameter making it impossible to use the API if the class
is not publicly defined.
Reviewed By: alexander-shaposhnikov
Differential Revision: https://reviews.llvm.org/D109412
- This patch adds in the GOFFMCAsmInfo interfaces for the z/OS target.
- This patch decouples the previously existing SystemZMCAsmInfo interface for the ELF target and the z/OS target.
- This patch also removes a small test in the SystemZAsmLexerTest.cpp. The reason for this is because, the test is set up for the s390x-ibm-linux (SystemZ ELF triple), and the test checks a function which is overridden only for the z/OS target. The reason we can't change the test to use a z/OS triple outright is because there is still missing support which prevents the successful running of a test (assert in AsmParser.cpp due to missing GOFFAsmParser support)
Reviewed By: uweigand, abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D110077
We previously had a limitation that TLS variables could not
be exported (and therefore could also not be imported). This
change removed that limitation.
Differential Revision: https://reviews.llvm.org/D108877
Previously the relocations pointed at the public user facing,
possibly external symbol.
When the function itself is weak, that symbol may be overridden at
link time, pointing at another strong implementation of the same
function instead. In that case, there's two conflicting pdata entries
pointing at the same address, and the wrong unwind info might end up
used.
Both GCC/binutils and MSVC produce pdata pointing at internal static
symbols. (GCC/binutils point at the .text section just as LLVM does
after this change, MSVC points at special label type symbols with the
type IMAGE_SYM_CLASS_LABEL and names like '$LN4'.)
This fixes unwinding through an overridden "operator new" with a
statically linked C++ library in MinGW mode. (Building libc++ with
-ffunction-sections and linking with --gc-sections might avoid the
issue too.)
This makes the produced object files a little less user friendly
to debug, but with other recent improvements for llvm-readobj, the
unwind info debugging experience should be pretty much the same.
Differential Revision: https://reviews.llvm.org/D109651
The call to emitCodeAlignment was missing a STI which is required
after D45962.
emitCodeAlignment has a default parameter of 0 for MaxBytesToEmit.
Explicitly passing 0 here was interpreted as as nullptr for the STI.
This could possibly be avoided by taking STI as a const reference in
emitCodeAlignment.
Differential Revision: https://reviews.llvm.org/D109425
On some architectures such as Arm and X86 the encoding for a nop may
change depending on the subtarget in operation at the time of
encoding. This change replaces the per module MCSubtargetInfo retained
by the targets AsmBackend in favour of passing through the local
MCSubtargetInfo in operation at the time.
On Arm using the architectural NOP instruction can have a performance
benefit on some implementations.
For Arm I've deleted the copy of the AsmBackend's MCSubtargetInfo to
limit the chances of this causing problems in the future. I've not
done this for other targets such as X86 as there is more frequent use
of the MCSubtargetInfo and it looks to be for stable properties that
we would not expect to vary per function.
This change required threading STI through MCNopsFragment and
MCBoundaryAlignFragment.
I've attempted to take into account the in tree experimental backends.
Differential Revision: https://reviews.llvm.org/D45962
In preparation for passing the MCSubtargetInfo (STI) through to writeNops
so that it can use the STI in operation at the time, we need to record the
STI in operation when a MCAlignFragment may write nops as padding. The
STI is currently unused, a further patch will pass it through to
writeNops.
There are many places that can create an MCAlignFragment, in most cases
we can find out the STI in operation at the time. In a few places this
isn't possible as we are in initialisation or finalisation, or are
emitting constant pools. When possible I've tried to find the most
appropriate existing fragment to obtain the STI from, when none is
available use the per module STI.
For constant pools we don't actually need to use EmitCodeAlign as the
constant pools are data anyway so falling through into it via an
executable NOP is no better than falling through into data padding.
This is a prerequisite for D45962 which uses the STI to emit the
appropriate NOP for the STI. Which can differ per fragment.
Note that involves an interface change to InitSections. It is now
called initSections and requires a SubtargetInfo as a parameter.
Differential Revision: https://reviews.llvm.org/D45961
This add support for SjLj using Wasm exception handling instructions:
https://github.com/WebAssembly/exception-handling/blob/master/proposals/exception-handling/Exceptions.md
This does not yet support the mixed use of EH and SjLj within a
function. It will be added in a follow-up CL.
This currently passes all SjLj Emscripten tests for wasm0/1/2/3/s,
except for the below:
- `test_longjmp_standalone`: Uses Node
- `test_dlfcn_longjmp`: Uses NodeRAWFS
- `test_longjmp_throw`: Mixes EH and SjLj
- `test_exceptions_longjmp1`: Mixes EH and SjLj
- `test_exceptions_longjmp2`: Mixes EH and SjLj
- `test_exceptions_longjmp3`: Mixes EH and SjLj
Reviewed By: dschuff, tlively
Differential Revision: https://reviews.llvm.org/D108960
sh_info links to a section, therefore SHF_INFO_LINK should be set as GNU as
does. The issue has been benign because linkers kindly combines relocation
sections w/ and w/o the flag.
Currently context strings contain a lot of duplicated function names and that significantly increase the profile size. This change split the context into a series of {name, offset, discriminator} tuples so function names used in the context can be replaced by the index into the name table and that significantly reduce the size consumed by context.
A follow-up improvement made in the compiler and profiling tools is to avoid reconstructing full context strings which is time- and memory- consuming. Instead a context vector of `StringRef` is adopted to represent the full context in all scenarios. As a result, the previous prevalent profile map which was implemented as a `StringRef` is now engineered as an unordered map keyed by `SampleContext`. `SampleContext` is reshaped to using an `ArrayRef` to represent a full context for CS profile. For non-CS profile, it falls back to use `StringRef` to represent a contextless function name. Both the `ArrayRef` and `StringRef` objects are underpinned by real array and string objects that are stored in producer buffers. For compiler, they are maintained by the sample reader. For llvm-profgen, they are maintained in `ProfiledBinary` and `ProfileGenerator`. Full context strings can be generated only in those cases of debugging and printing.
When it comes to profile format, nothing has changed to the text format, though internally CS context is implemented as a vector. Extbinary format is only changed for CS profile, with an additional `SecCSNameTable` section which stores all full contexts logically in the form of `vector<int>`, which each element as an offset points to `SecNameTable`. All occurrences of contexts elsewhere are redirected to using the offset of `SecCSNameTable`.
Testing
This is no-diff change in terms of code quality and profile content (for text profile).
For our internal large service (aka ads), the profile generation is cut to half, with a 20x smaller string-based extbinary format generated.
The compile time of ads is dropped by 25%.
Differential Revision: https://reviews.llvm.org/D107299
This makes sure, that the text section will have a 2-byte alignment, if
the +c extension is enabled.
Reviewed By: MaskRay, luismarques
Differential Revision: https://reviews.llvm.org/D102052
Similar to D97976.
On Linux, most GCC installations are configured with
`--enable-gnu-unique-object` and such GCC emits `@gnu_unique_object` assembly.
The feature is highly controversial and disliked by many folks.
(On glibc DF_1_NODELETE is implicitly enabled and makes dlclose a no-op).
In llvm-project STB_GNU_UNIQUE is assembly only. Clang does not use STB_GNU_UNIQUE.
Use ELFOSABI_GNU to match GNU as behavior and avoid collision with other
OSABI binding values.
Reviewed By: jrtc27
Differential Revision: https://reviews.llvm.org/D107861
Fixes issue: https://bugs.llvm.org/show_bug.cgi?id=47983
The AsmLexer currently has an issue with lexing line comments in files
with CRLF line endings, in which it reads the carriage return as being
part of the line comment. This causes an error for certain valid comment
layouts; this patch fixes this by excluding the carriage return from the
line comment.
Differential Revision: https://reviews.llvm.org/D90234
Pseudo probe descriptors are created very early in the pipeline where function names just come from the front end and are not yet decorated. So calling getCanonicalFnName on the function names in probe desc is basically a no-op, which also addes a depenency from MC to ProfileData unnessesarily.
Reviewed By: wenlei, wlei
Differential Revision: https://reviews.llvm.org/D107838
MIPS .debug_* sections should have SHT_MIPS_DWARF section type to
distinguish among sections contain DWARF and ECOFF debug formats, but in
assembly files these sections have SHT_PROGBITS (@progbits) type. Now
assembler shows 'changed section type for ...' error when parsing
`.section .debug_*,"",@progbits` directive for MIPS targets.
The same problem exists for x86-64 target and this patch extends
workaround implemented in D76151. The patch adds one more case
when assembler ignores section types mismatch after `SwitchSection()`
call.
Differential Revision: https://reviews.llvm.org/D107707
Fixes issue where late materialized constants can be more strictly
aligned then their containing csect.
Differential Revision: https://reviews.llvm.org/D103103
D106861 added usage of PseudoProbeAttributes::Reserved as TailCall however this usage hasn't been committed/reviewed. Removing this usage.
Testing
ninja check-all
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D107514
Migrate pseudo probe decoding logic in llvm-profgen to MC, so other LLVM-base program could reuse existing codes. Redesign object layout of encoded and decoded pseudo probes.
Reviewed By: hoy
Differential Revision: https://reviews.llvm.org/D106861
Previously we would emit constant pool entries for ldr inline asm at the
very end of AsmPrinter::doFinalization(). However, if we're emitting
dwarf aranges, that would end all sections with aranges. Then if we have
constant pool entries to be emitted in those same sections, we'd hit an
assert that the section has already been ended.
We want to emit constant pool entries before emitting dwarf aranges.
This patch splits out arm32/64's constant pool entry emission into its
own MCTargetStreamer virtual method.
Fixes PR51208
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D107314
- This patch consists of the bare basic code needed in order to generate some assembly for the z/OS target.
- Only the .text and the .bss sections are added for now.
- The relevant MCSectionGOFF/Symbol interfaces have been added. This enables us to print out the GOFF machine code sections.
- This patch enables us to add simple lit tests wherever possible, and contribute to the testing coverage for the z/OS target
- Further improvements and additions will be made in future patches.
Reviewed By: tmatheson
Differential Revision: https://reviews.llvm.org/D106380
We previously had issues identifying macros not registered with a lowercase name.
Reviewed By: mstorsjo, thakis
Differential Revision: https://reviews.llvm.org/D106453
Add support for all built-in text macros supported by ML64:
@Date, @Time, @FileName, @FileCur, and @CurSeg.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D104965
Support @Version and @Line as built-in symbols. For now, resolves @Version to 1427 (the same as for the VS 2019 release of ML.EXE).
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D104964
Debug info sections need R_WASM_FUNCTION_OFFSET_I32 relocs (with FK_Data_4 fixup
kinds) to refer to functions (instead of R_WASM_TABLE_INDEX as is used in data
sections). Usually this is done in a convoluted way, with unnamed temp data
symbols which target the start of the function, in which case
WasmObjectWriter::recordRelocation converts it to use the section symbol
instead. However in some cases the function can actually be undefined; in this
case the dwarf generator uses the function symbol (a named undefined function
symbol) instead. In that case the section-symbol transform doesn't work and we
need to generate the correct reloc type a different way. In this change
WebAssemblyWasmObjectWriter::getRelocType takes the fixup section type into
account to choose the correct reloc type.
Fixes PR50408
Differential Revision: https://reviews.llvm.org/D103557
No implementation uses the `LocCookie` parameter at all. Errors are
reported from inside that function by `llvm::SourceMgr`, and the
instance of that at the clang call site arranges to pass the error
messages back to a `ClangAsmParserCallback`, which is where the clang
SourceLocation for the error is computed.
(This is part of a patch series working towards the ability to make
SourceLocation into a 64-bit type to handle larger translation units.
But this particular change seems beneficial in its own right.)
Reviewed By: miyuki
Differential Revision: https://reviews.llvm.org/D105490
This to protect against non-sensical instruction sequences being assembled,
which would either cause asserts/crashes further down, or a Wasm module being output that doesn't validate.
Unlike a validator, this type checker is able to give type-errors as part of the parsing process, which makes the assembler much friendlier to be used by humans writing manual input.
Because the MC system is single pass (instructions aren't even stored in MC format, they are directly output) the type checker has to be single pass as well, which means that from now on .globaltype and .functype decls must come before their use. An extra pass is added to Codegen to collect information for this purpose, since AsmPrinter is normally single pass / streaming as well, and would otherwise generate this information on the fly.
A `-no-type-check` flag was added to llvm-mc (and any other tools that take asm input) that surpresses type errors, as a quick escape hatch for tests that were not intended to be type correct.
This is a first version of the type checker that ignores control flow, i.e. it checks that types are correct along the linear path, but not the branch path. This will still catch most errors. Branch checking could be added in the future.
Differential Revision: https://reviews.llvm.org/D104945
C++23 will make these conversions ambiguous - so fix them to make the
codebase forward-compatible with C++23 (& a follow-up change I've made
will make this ambiguous/invalid even in <C++23 so we don't regress
this & it generally improves the code anyway)
In MASM, the ifdef family of directives treats its argument literally, without expanding it as a text macro. Add support for this, and also replace the special handling that was previously used for echo.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D104196
Add a flag so that target can choose to use AsmParser for parsing inline asm.
And set the flag by default for AIX.
-no-intergrated-as will override this default if specified explicitly.
Reviewed By: #powerpc, shchenz
Differential Revision: https://reviews.llvm.org/D105314
Implement XCOFFMCAsmParser so that we can use MC to parse inline asm.
The directives and storage mapping classes will be added later
iteratively.
Reviewed By: xgupta
Differential Revision: https://reviews.llvm.org/D105259
Enable the emission of a GNU attributes section by reusing the code for
emitting the ARM build attributes section.
The GNU attributes follow the exact same section format as the ARM
BuildAttributes section, so this can be factored out and reused for GNU
attributes generally.
The immediate motivation for this is to emit a GNU attributes section for the
vector ABI on SystemZ (https://reviews.llvm.org/D105067).
Review: Logan Chien, Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D102894
- In the caller of the overridden `parseStatement` function (i.e. the `AsmParser::Run()`) in the case of an error **and** if we're not at the start of the statement, we "eat" up until the end of the current statement, so we don't have to process it again.
- However, in the HLASMAsmParser class what's happening is that, if an error occurs at the very start of the statement (for example, you invoke the HLASMAsmParser to parse a gnu directive), we will error out, but we never really progress in terms of the next token in the statement to parse. We simply keep looping processing the same error over and over again (partly because we're at the start of the statement)
- To remedy this, when the `parseAsHLASMLabel` function fails, before returning, we "eat" until the end of the statement function, so we don't process it anymore.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D104869
Match ML.EXE's behavior for ALIGN, EVEN, and ORG directives both at file level and in STRUCTs.
We currently reject negative offsets passed to ORG inside STRUCTs (in ML.EXE and ML64.EXE, they wrap around as for an unsigned 32-bit integer).
Also, if a STRUCT is declared using an ORG directive, no value of that type can be defined.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D92507
... even on targets preferring RELA. The section is only consumed by ld.lld
which can handle REL.
Follow-up to D104080 as I explained in the review. There are two advantages:
* The D104080 code only handles RELA, so arm/i386/mips32 etc may warn for -fprofile-use=/-fprofile-sample-use= usage.
* Decrease object file size for RELA targets
While here, change the relocation to relocate weights, instead of 0,1,2,3,..
I failed to catch the issue during review.
This is a mechanical change. This actually also renames the
similarly named methods in the SmallString class, however these
methods don't seem to be used outside of the llvm subproject, so
this doesn't break building of the rest of the monorepo.
- Currently, the emitting of labels in the parsePrimaryExpr function is case independent. It just takes the identifier and emits it.
- However, for HLASM the emitting of labels is case independent. We are emitting them in the upper case only, to enforce case independency. So we need to ensure that at the time of parsing the label we are emitting the upper case (in `parseAsHLASMLabel`), but also, when we are processing a PC-relative relocatable expression, we need to ensure we emit it in upper case (in `parsePrimaryExpr`)
- To achieve this a new MCAsmInfo attribute has been introduced which corresponding targets can override if needed.
Reviewed By: abhina.sreeskantharajan, uweigand
Differential Revision: https://reviews.llvm.org/D104715
Currently when .llvm.call-graph-profile is created by llvm it explicitly encodes the symbol indices. This section is basically a black box for post processing tools. For example, if we run strip -s on the object files the symbol table changes, but indices in that section do not. In non-visible behavior indices point to wrong symbols. The visible behavior indices point outside of Symbol table: "invalid symbol index".
This patch changes the format by using R_*_NONE relocations to indicate the from/to symbols. The Frequency (Weight) will still be in the .llvm.call-graph-profile, but symbol information will be in relocation section. In LLD information from both sections is used to reconstruct call graph profile. Relocations themselves will never be applied.
With this approach post processing tools that handle relocations correctly work for this section also. Tools can add/remove symbols and as long as they handle relocation sections with this approach information stays correct.
Doing a quick experiment with clang-13.
The size went up from 107KB to 322KB, aggregate of all the input sections. Size of clang-13 binary is ~118MB. For users of -fprofile-use/-fprofile-sample-use the size of object files will go up slightly, it will not impact final binary size.
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D104080
This changes the encoding of the `attribute` field, which currently only
contains the value `0` denoting this tag is for an exception, from
`varuint32` to `uint8`. This field is effectively unused at the moment
and reserved for future use, and it is not likely to need `varuint32`
even in future.
See https://github.com/WebAssembly/exception-handling/pull/162.
This does not change any encoded binaries because `0` is encoded in the
same way both in `varuint32` and `uint8`.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D104571
This re-architects the RISCV relocation handling to bring the
implementation closer in line with the implementation in binutils. We
would previously aggressively resolve the relocation. With this
restructuring, we always will emit a paired relocation for any symbolic
difference of the type of S±T[±C] where S and T are labels and C is a
constant.
GAS has a special target hook controlled by `RELOC_EXPANSION_POSSIBLE`
which indicates that a fixup may be expanded into multiple relocations.
This is used by the RISCV backend to always emit a paired relocation -
either ADD[WIDTH] + SUB[WIDTH] for text relocations or SET[WIDTH] +
SUB[WIDTH] for a debug info relocation. Irrespective of whether linker
relaxation support is enabled, symbolic difference is always emitted as
a paired relocation.
This change also sinks the target specific behaviour down into the
target specific area rather than exposing it to the shared relocation
handling. In the process, we also sink the "special" handling for debug
information down into the RISCV target. Although this improves the path
for the other targets, this is not necessarily entirely ideal either.
The changes in the debug info emission could be done through another
type of hook as this functionality would be required by any other target
which wishes to do linker relaxation. However, as there are no other
targets in LLVM which currently do this, this is a reasonable thing to
do until such time as the code needs to be shared.
Improve the handling of the relocation (and add a reduced test case from
the Linux kernel) to ensure that we handle complex expressions for
symbolic difference. This ensures that we correct relocate symbols with
the adddends normalized and associated with the addition portion of the
paired relocation.
This change also addresses some review comments from Alex Bradbury about
the relocations meant for use in the DWARF CFA being named incorrectly
(using ADD6 instead of SET6) in the original change which introduced the
relocation type.
This resolves the issues with the symbolic difference emission
sufficiently to enable building the Linux kernel with clang+IAS+lld
(without linker relaxation).
Resolves PR50153, PR50156!
Fixes: ClangBuiltLinux/linux#1023, ClangBuiltLinux/linux#1143
Reviewed By: nickdesaulniers, maskray
Differential Revision: https://reviews.llvm.org/D103539
If a macro is defined on the command line and then overridden in the source code, this is likely to be an error in the user's build system. We should warn on this.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D104008
MASM specifies that all variable definitions are redefinable, except for EQU definitions to expressions. (TEXTEQU is unspecified, but appears to be fully redefinable as well.)
Also, in practice, ML.EXE allows redefinitions where the value doesn't change.
Make variable redefinition possible for text macros, suppressing expansion if written as the first argument to an EQU or TEXTEQU directive.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D103993
Summary: Some structs like FileHeader32/SectionHeader32
defined in llvm/include/llvm/BinaryFormat/XCOFF.h seem
unnecessary, because we only need their size. So this
patch removes them and defines size constants directly.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D103901
This is a followup to D103422. The DenseMapInfo implementations for
ArrayRef and StringRef are moved into the ArrayRef.h and StringRef.h
headers, which means that these two headers no longer need to be
included by DenseMapInfo.h.
This required adding a few additional includes, as many files were
relying on various things pulled in by ArrayRef.h.
Differential Revision: https://reviews.llvm.org/D103491
- A lot of lit tests simply specify the arch minus the triple. On z/OS, this could result in a scenario of some-other-triple-unknown-ibm-zos. This points to an incorrect triple + arch combo.
- To prevent this, isOSzOS change is switched in favour of isOSBinFormatGOFF.
- This is because, the GOFF format is set only if the triple is systemz and if the operating system is GOFF. And currently, there are no other architectures/os's using the GOFF file format.
- An argument could be made that the problematic tests be fixed to explicitly specify the arch-vendor-triple string, but there's a large number of these tests, and adding this stricter scope ensures that we aren't instantiating the incorrect instance of the AsmParser for other platforms when run on z/OS.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D103343
- This patch is the second (and hopefully final) part of providing HLASM syntax for inline asm statements for z/OS to LLVM (continuing on from https://reviews.llvm.org/D98276)
- This second part deals with providing label support
- As mentioned in https://reviews.llvm.org/D98276, if the first token is not a space we process the first token as a label, and the remaining tokens as a possible machine instruction
- To achieve this, a new `parseAsHLASMLabel` function is introduced. This function processes the first token, validates whether it is an "acceptable" label according to HLASM standards, and then emits it
- After handling and emitting the label, call the `parseAsMachineInstruction` instruction to process the remaining tokens as a machine instruction.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D103320
Global values imply flags such as readable, writable, executable for the
sections that they will be placed in. Currently MC places all such
entries into the same section, using the first set of flags seen. This
can lead to situations in LTO where a writable global is placed in the
same named section as a readable global from another file, and the
section may not be marked writable.
D72194 ensures that mergeable globals with explicit sections are placed
in separate sections with compatible entry size, by emitting the
`unique` assembly syntax where appropriate. This change extends that
approach to include section flags, so that globals with different
section flags are emitted in separate unique sections.
Differential revision: https://reviews.llvm.org/D100944
This makes it possible for targets to define their own MCObjectFileInfo.
This MCObjectFileInfo is then used to determine things like section alignment.
This is a follow up to D101462 and prepares for the RISCV backend defining the
text section alignment depending on the enabled extensions.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101921
.byte supports string, so if the whole byte list are printable,
we can actually print the string for readability and LIT tests maintainence.
.byte 'H,'e,'l,'l,'o,',,' ,'w,'o,'r,'l,'d
->
.byte "Hello, world"
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D102814
__table_base is know 64-bit, since in LLVM it represents a function pointer offset
__table_base32 is a copy in wasm32 for use in elem init expr, since no truncation may be used there.
New reloc R_WASM_TABLE_INDEX_REL_SLEB64 added
Differential Revision: https://reviews.llvm.org/D101784
- This patch (is one in a series of patches) which introduces HLASM Parser support (for the first parameter of inline asm statements) to LLVM ([[ https://lists.llvm.org/pipermail/llvm-dev/2021-January/147686.html | main RFC here ]])
- This patch in particular introduces HLASM Parser support for Z machine instructions.
- The approach taken here was to subclass `AsmParser`, and make various functions and variables as "protected" wherever appropriate.
- The `HLASMAsmParser` class overrides the `parseStatement` function. Two new private functions `parseAsHLASMLabel` and `parseAsMachineInstruction` are introduced as well.
The general syntax is laid out as follows (more information available in [[ https://www.ibm.com/support/knowledgecenter/SSENW6_1.6.0/com.ibm.hlasm.v1r6.asm/asmr1023.pdf | HLASM V1R6 Language Reference Manual ]] - Chapter 2 - Instruction Statement Format):
```
<TokA><spaces.*><TokB><spaces.*><TokC><spaces.*><TokD>
```
1. TokA is referred to as the Name Entry. This token is optional
2. TokB is referred to as the Operation Entry. This token is mandatory.
3. TokC is referred to as the Operand Entry. This token is mandatory
4. TokD is referred to as the Remarks Entry. This token is optional
- If TokA is provided, then we either parse TokA as a possible comment or as a label (Name Entry), Tok B as the Operation Entry and so on.
- If TokA is not provided (i.e. we have one or more spaces and then the first token), then we will parse the first token (i.e TokB) as a possible Z machine instruction, TokC as the operands to the Z machine instruction and TokD as a possible Remark field
- TokC (Operand Entry), no spaces are allowed between OperandEntries. If a space occurs it is classified as an error.
- TokD if provided is taken as is, and emitted as a comment.
The following additional approach was examined, but not taken:
- Adding custom private only functions to base AsmParser class, and only invoking them for z/OS. While this would eliminate the need for another child class, these private functions would be of non-use to every other target. Similarly, adding any pure virtual functions to the base MCAsmParser class and overriding them in AsmParser would also have the same disadvantage.
Testing:
- This patch doesn't have tests added with it, for the sole reason that MCStreamer Support and Object File support hasn't been added for the z/OS target (yet). Hence, it's not possible generate code outright for the z/OS target. They are in the process of being committed / process of being worked on.
- Any comments / feedback on how to combat this "lack of testing" due to other missing required features is appreciated.
Reviewed By: Kai, uweigand
Differential Revision: https://reviews.llvm.org/D98276
Adds the ability to pass MCRegisterInfo to dump_pretty and to the print functions,
so that if present, target specific enums names are printed instead of enum values.
This matches how they are defined on X86.
This should fix the relative lookup tables pass for COFF, allowing
it to be reenabled.
Differential Revision: https://reviews.llvm.org/D102217
This change was originally landed in: 5000a1b4b9
It was reverted in: 061e071d8c
This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.
Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)
Like the ELF linker merging is only performed at `-O1` and above.
This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it doesn't not currently support debug sections
because they are not represented by data segments (they are custom
sections)
Differential Revision: https://reviews.llvm.org/D97657
This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.
Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)
Like the ELF linker merging is only performed at `-O1` and above.
This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it doesn't not currently support debug sections
because they are not represented by data segments (they are custom
sections)
Differential Revision: https://reviews.llvm.org/D97657
- Add branch absolute reloction R_RBA, R_TLS relocation for the variable offset
for the tlsgd model and R_TLSM for the region handle for the tlsgd model
- Properly set the relocation fixed values for R_TLS and R_TLSM
- Emit the TCEntry with the variant kind in the XCOFFStreamer
Reviewed by: sfertile, nemanjai, DiggerLin
Differential Revision: https://reviews.llvm.org/D100214
This untangles the MCContext and the MCObjectFileInfo. There is a circular
dependency between MCContext and MCObjectFileInfo. Currently this dependency
also exists during construction: You can't contruct a MOFI without a MCContext
without constructing the MCContext with a dummy version of that MOFI first.
This removes this dependency during construction. In a perfect world,
MCObjectFileInfo wouldn't depend on MCContext at all, but only be stored in the
MCContext, like other MC information. This is future work.
This also shifts/adds more information to the MCContext making it more
available to the different targets. Namely:
- TargetTriple
- ObjectFileType
- SubtargetInfo
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101462
- As per the HLASM support we are providing, i.e. support only for the first parameter of the inline asm block, only pertaining to Z machine instructions defined in LLVM, character literals and string literals are not supported (see Figure 4 - https://www-01.ibm.com/servers/resourcelink/svc00100.nsf/pages/zOSV2R3sc264940/$file/asmr1023.pdf for more information)
- This patch explicitly rejects the usage of char literals and string literals (for example "abc 'a'") when the relevant field is set
- This is achieved by introducing a field called `LexHLASMStrings` in MCAsmLexer similar to `LexMasmStrings`
Reviewed By: abhina.sreeskantharajan, Kai
Differential Revision: https://reviews.llvm.org/D101660