Summary:
As proposed in https://github.com/WebAssembly/simd/pull/122. Since
these instructions are not yet merged to the SIMD spec proposal, this
patch makes them entirely opt-in by surfacing them only through LLVM
intrinsics and clang builtins. If these instructions are made
official, these intrinsics and builtins should be replaced with simple
instruction patterns.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D79742
This does the same thing as {vex2}. Which is give an error
if the instruction can't be done with VEX. It doesn't force
the instruction to use 2 byte VEX. That's already the preference
if its possible. Therefore {vex} is a clearer name.
Neither gcc or icc support this. Split out from D79472. I want
to remove more, but it looks like icc does support some things
gcc doesn't and I need to double check our internal test suites.
This patch adds more constant materialization tests, focusing on cases where
we could improve our materialization instruction sequences (particularly for
RV64). Various of these cases will be improved upon in follow-up patches.
Differential Revision: https://reviews.llvm.org/D79453
The function MCSymbolRefExpr::getVariantKindForName was missing the entry for
VK_PPC_GOT_PCREL. This patch adds the missing entry.
Differential Revision: https://reviews.llvm.org/D79015
Summary:
The RISC-V debug register was named dscratch in a previous draft of the RISC-V
debug mode spec. The number of registers has been increased to 2 in the latest
ratified version of the debug mode spec and the registers were named dscratch0
and dscratch1. We still support using the old register name "dscratch", but it
would be disassembled as "dscratch0" with this change.
Reviewers: apazos, asb, lenary, luismarques
Reviewed By: asb
Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, sameer.abuasal, evandro, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78764
Summary:
As described in https://github.com/WebAssembly/simd/pull/209. This is
the final reorganization of the SIMD opcode space before
standardization. It has been landed in concert with corresponding
changes in other projects in the WebAssembly SIMD ecosystem.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79224
This change add support for defined wasm globals in the .s format,
the MC layer, and wasm-ld
Currently there is no support custom initialization and all wasm
globals are initialized to zero.
Fixes: PR45742
Differential Revision: https://reviews.llvm.org/D79137
For compatibility with other assemblers on the platform, allow
using just plain integer register numbers in all places where a
register operand is expected.
Bug: llvm.org/PR45582
Summary:
AArch64's system register ERXTS_EL1 is present in the backend as a
component of the Arm Reliability, Availability and Serviceability (RAS)
extension. However, it has been removed from the specification before
its final release.
This patch removes the register.
Reviewers: SjoerdMeijer, DavidSpickett
Reviewed By: DavidSpickett
Subscribers: DavidSpickett, kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79007
getTargetStreamer() might return null (e.g. when running inlined-strings.ll test),
downcasting to a reference will be wrong. This is detectable with -fsanitize=null.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D78686
D63847 added `MCInstrAnalysis::evaluateMemoryOperandAddress()`. This patch
leverages the feature to print the target addresses for evaluable instructions.
```
-400a: movl 4080(%rip), %eax
+400a: movl 4080(%rip), %eax # 5000 <data1>
```
This patch also deletes `MIA->isCall(Inst) || MIA->isUnconditionalBranch(Inst) || MIA->isConditionalBranch(Inst)`
which is used to guard `MCInstrAnalysis::evaluateBranch()`
Reviewed By: jhenderson, skan
Differential Revision: https://reviews.llvm.org/D78776
Summary:
It is important to emit HINT instructions instead of PAC ones when
PAC is disabled. This allows compatibility with other assemblers
(e.g. GAS). This was implemented in commit da33762de8.
Still, developers of assembly code will want to write code that is
compatible with both pre- and post-PAC CPUs. They could use HINT
mnemonics, but the new mnemonics are a lot more readable (e.g.
paciaz instead of hint #24), and they will result in the same
encodings. So, while LLVM should not *emit* the new mnemonics when
PAC is disabled, this patch will at least make LLVM *accept*
assembly code that uses them.
Reviewers: danielkiss, chill, olista01, LukeCheeseman, simon_tatham
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78372
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
This patch includes:
- Assembly support for AArch32 and Assembly Parsing
D77872 has already added the MC representations of the instructions so that
they can be used in code gen; this patch fills in the details needed to
make assembly parsing work, and adds tests for asm and disasm
This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)
Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman
Reviewers: t.p.northover, simon_tatham
Reviewed By: simon_tatham
Subscribers: simon_tatham, ostannard, kristof.beyls, hiraditya,
danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77874
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
This patch includes:
- Assembly support for AArch64 Scalable Vector Instructions (in line
with the Scalable Vector Extension - SVE)
This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)
Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman
Reviewers: t.p.northover, rengolin, c-rhodes
Reviewed By: c-rhodes
Subscribers: c-rhodes, ostannard, tschuett, kristof.beyls, hiraditya,
danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77873
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
This patch includes:
- Assembly support for AArch64 only (no SVE or Neon)
- Intrinsics Support for AArch64 Armv8.6a Matrix Multiplication Instructions (No bfloat16 matrix multiplication)
No IR types or C Types are needed for this extension.
This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)
Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman
Reviewers: ostannard, t.p.northover, rengolin, kmclaughlin
Reviewed By: kmclaughlin
Subscribers: kmclaughlin, kristof.beyls, hiraditya, danielkiss,
cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77871
Summary:
This commit recommits the reversion of https://reviews.llvm.org/D75039.
Concensus appears to be in favour of assembly-time resolution of
these ADR and LDR relocations, in line with GNU. The previous
backout broke many lld tests, now fixed by Peter Smith in
61bccda9d9.
Reviewers: psmith
Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78301
Summary:
Before this patch, `relaxInstruction` takes three arguments, the first
argument refers to the instruction before relaxation and the third
argument is the output instruction after relaxation. There are two quite
strange things:
1) The first argument's type is `const MCInst &`, the third
argument's type is `MCInst &`, but they may be aliased to the same
variable
2) The backends of ARM, AMDGPU, RISC-V, Hexagon assume that the third
argument is a fresh uninitialized `MCInst` even if `relaxInstruction`
may be called like `relaxInstruction(Relaxed, STI, Relaxed)` in a
loop.
In this patch, we drop the thrid argument, and let `relaxInstruction`
directly modify the given instruction. Also, this patch fixes the bug https://bugs.llvm.org/show_bug.cgi?id=45580, which is introduced by D77851, and
breaks the assumption of ARM, AMDGPU, RISC-V, Hexagon.
Reviewers: Razer6, MaskRay, jyknight, asb, luismarques, enderby, rtaylor, colinl, bcain
Reviewed By: Razer6, MaskRay, bcain
Subscribers: bcain, nickdesaulniers, nathanchance, wuzish, annita.zhang, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, tpr, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78364
According to DWARF standard, is_stmt is a global flag; when set or cleared it should affect subsequent .loc directives.
However llvm assembler handled is_stmt differently: it forced all locations to have is_stmt=1 unless is_stmt was specified explicitly as 0.
The fix utilizes current DWARF state flags to compute correct is_stmt values.
See https://bugs.llvm.org/show_bug.cgi?id=45529 for a detailed issue description.
Reviewers: arsenm, probinson, enderby
Differential Revision: https://reviews.llvm.org/D78102
Summary:
This commit reverts https://reviews.llvm.org/D75039. Concensus appears to
be in favour of assembly-time resolution of these ADR and LDR relocations,
in line with GNU.
Reviewers: psmith
Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78301
Summary:
The instruction in non-text section can not be executed, so they will not affect performance.
In addition, their encoding values are treated as data, so we should not touch them.
Reviewers: MaskRay, reames, LuoYuanke, jyknight
Reviewed By: MaskRay
Subscribers: annita.zhang, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77971
GNU as emits SHT_PROGBITS .eh_frame by default for .cfi_* directives.
We follow x86-64 psABI and use SHT_X86_64_UNWIND for .eh_frame
Don't error for SHT_PROGBITS .eh_frame on x86-64.
This keeps compatibility with `.section .eh_frame,"a",@progbits` in existing assembly files.
See https://groups.google.com/d/msg/x86-64-abi/7sr4E6THl3g/zUU2UPHOAQAJ
for more discussions.
Reviewed By: joerg
Differential Revision: https://reviews.llvm.org/D76151
For `.bss; nop`, MC inappropriately calls abort() (via report_fatal_error()) with a message
`cannot have fixups in virtual section!`
It is a bug to crash for invalid user input. Fix it by erroring out early in EmitInstToData().
Similarly, emitIntValue() in a virtual section (SHT_NOBITS in ELF) can crash with the mssage
`non-zero initializer found in section '.bss'` (see D4199)
It'd be nice to report the location but so many directives can call emitIntValue()
and it is difficult to track every location.
Note, COFF does not crash because MCAssembler::writeSectionData() is not
called for an IMAGE_SCN_CNT_UNINITIALIZED_DATA section.
Note, GNU as' arm64 backend reports ``Error: attempt to store non-zero value in section `.bss'``
for a non-zero .inst but fails to do so for other instructions.
We simply reject all instructions, even if the encoding is all zeros.
The Mach-O counterpart is D48517 (see `test/MC/MachO/zerofill-text.s`)
Reviewed By: rnk, skan
Differential Revision: https://reviews.llvm.org/D78138
As we disscussed in D77971, we haven't confirmed that if putting instructions
in a non-executable section is an undefined behaviour. To make things
easier to go on, we mark these sections executable in test file
align-branch-section-size.s.
The _GLOBAL_OFFSET_TABLE_ in SysVr4 ELF is conventionally the base of the
.got or .got.prel sections. Expressions such as _GLOBAL_OFFSET_TABLE_
- (.L1 +8) are used in assembler code to calculate offsets into the .got.
At present MC outputs a R_ARM_REL32 with respect to the
_GLOBAL_OFFSET_TABLE_ symbol, whereas gas outputs a R_ARM_BASE_PREL
relocation with respect to the _GLOBAL_OFFSET_TABLE_ symbol. While both are
correct the R_ARM_REL32 depends on the value of the _GLOBAL_OFFSET_TABLE_
symbol, wheras te R_ARM_BASE_PREL relocation is idependent of the symbol.
The R_ARM_BASE_PREL is therefore slightly more robust to linker's that may
not follow the conventional placement of _GLOBAL_OFFSET_TABLE_; for example
LLD for some time defined _GLOBAL_OFFSET_TABLE_ to 0.
Differential Revision: https://reviews.llvm.org/D46319
Otherwise, depending on the lit location used to run the test, llvm-mc adds an
include_directories entry in the dwarf output, which breaks tests in some setup.
Differential Revision: https://reviews.llvm.org/D77876
* Reorganize tests and add coverage
* Improve diagnostic testing
* Make assert() tests more relevant
* Rename tests to macro-* or altmacro-*
This is not NFC because a (previously untested) diagnostic message is changed.
Summary: We allow non-relaxable instructions emitted into relaxable Fragment when we prefix padding branch. So we need to check if the instruction need relaxation before relaxing it. Without this patch, it currently triggers a `report_fatal_error` in `llvm::MCAsmBackend::relaxInstruction` when we prefix padding branch along with `--mc-relax-all`.
Reviewers: LuoYuanke, reames, MaskRay
Reviewed By: MaskRay
Subscribers: MaskRay, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77851
This adds the instruction encoding and mnenomics for the proposed
RISC-V Bit Manipulation extension (version 0.92). It is implemented with
each category of instruction as its own target feature, with the 'b'
extension feature enabling all options. Since this extension is not yet
ratified, all target features are prefixed with 'experimental-' to note
their status.
Differential Revision: https://reviews.llvm.org/D65649
This implements the instruction analysis required to print branch
targets as part of llvm-objdump's disassembly.
Note, this only handles those branches which can be analyzed in a single
instruction, a future patch will handle multiple-instruction patterns,
such as AUIPC/LUI+JALR instruction pairs.
Differential Revision: https://reviews.llvm.org/D77567
On PowerPC most functions require a valid TOC pointer.
This is the case because either the function itself needs to use this
pointer to access the TOC or because other functions that are called
from that function expect a valid TOC pointer in the register R2.
The main exception to this is leaf functions that do not access the TOC
since they are guaranteed not to need a valid TOC pointer.
This patch introduces a feature that will allow more functions to not
require a valid TOC pointer in R2.
Differential Revision: https://reviews.llvm.org/D73664
Summary:
Since D75300 has been landed, I want to support enhanced relaxation when we need to align branches and allow prefix padding. "Enhanced Relaxtion" means we allow an instruction that could not be traditionally relaxed to be emitted into RelaxableFragment so that we increase its length by adding prefixes for optimization.
The motivation is straightforward, RelaxFragment is mostly for relative jumps and we can not increase the length of jumps when we need to align them, so if we need to achieve D75300's purpose (reducing the bytes of nops) when need to align jumps, we have to make more instructions "relaxable".
Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight
Reviewed By: reames
Subscribers: hiraditya, llvm-commits, annita.zhang
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76286
From Arm v8 Architecture Reference Manual F5.1.84 LDREXD
The ldrexd instruction in Arm state has the following conditions:
t = UInt(Rt); t2 = t + 1; n = UInt(Rn);
if Rt<0> == '1' || t2 == 15 || n == 15 then UNPREDICTABLE;
In when Rt is odd or if Rt is 14 (making t2 15).
In the implementation when the pair is the UNPREDICTABLE R14_R15 we
would ideally return SOFT_FAIL. We can't because there is no R14_R15
value for us to return so we fail early returning FAIL.
The early return for registers outside the bounds of the table means
the check for Rt == 14 (0xE) redundant which causes a static analyzer
to flag the condition as never being true.
To fix the warning I've removed the check and replaced with a comment
explaining the difference with the specification.
Fixes pr41660
Differential Revision: https://reviews.llvm.org/D77463
So that constant expressions like the following are permitted:
and w0, w0, #~(0xfe<<24)
and w1, w1, #~(0xff<<24)
The behavior matches GNU as (opcodes/aarch64-opc.c:aarch64_logical_immediate_p).
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D75885
Summary:
This patch upstreams support the optional ARMv8.0 Data Gathering Hint (DGH)
extension, which adds the Data Gathering Hint instruction to the hint
space.
See ARMv8.0-DGH in the Arm Architecture Reference Manual Armv8 for more
information.
Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, danielkiss, samparker
Reviewed By: SjoerdMeijer
Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77097
Summary:
This patch upstreams support for the ARMv8.6A Enhanced Counter Virtualization
(ECV) extension, which adds 6 new system registers.
See ARMv8.6-ECV in the Arm Architecture Reference Manual Armv8 for more
information.
Reviewers: t.p.northover, rengolin, SjoerdMeijer, pcc, ab, chill
Reviewed By: SjoerdMeijer
Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77094
Summary:
This patch upstreams support for the ARMv8.6A Fine Grain Traps (FGT)
extension, which adds 5 new system registers.
See ARMv8.6-FGT in the Arm Architecture Reference Manual Armv8 for more
information.
Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, momchil.velikov
Reviewed By: SjoerdMeijer
Subscribers: LukeGeeson, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76991
Summary:
This patch upstreams v8.6A activity monitors virtualization
assembler support, which consists of 32 new system
registers (two groups, each with 16 numbered registers).
See ARMv8.6-AMU in the Arm Architecture Reference Manual Armv8 for more
information.
Reviewers: t.p.northover, rengolin, SjoerdMeijer, ab, john.brawn, ostannard
Reviewed By: ostannard
Subscribers: LukeGeeson, dnsampaio, ostannard, kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76998
1. It's redundant to explicitly check that string does not contain `PAUSE_MM`
token if check that the `PAUSE` token is followed by an angle bracket.
2. The removed `CHECK-NOT` directives do not have trailing colons.
Registers used in any address (as well as in a few other contexts)
have special semantics when a "zero" register is used, which is
why the back-end defines extra register classes ADDR32, ADDR64 etc
to be used to prevent the register allocator from using %r0 there.
However, when writing assembler code "by hand", you sometimes need
to trigger that special semantics. However, currently the AsmParser
will reject %r0 in those places. In some cases it may be possible
to write that instruction differently - but in others it is currently
not possible at all.
This check in AsmParser simply seems overly strict, so this patch
just removes the check completely. This brings the behaviour of
AsmParser in line with the GNU assembler as well.
Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45092
Leverage ARM ELF build attribute section to create ELF attribute section
for RISC-V. Extract the common part of parsing logic for this section
into ELFAttributeParser.[cpp|h] and ELFAttributes.[cpp|h].
Differential Revision: https://reviews.llvm.org/D74023
Generalizes D62014 (R_386_NONE/R_X86_64_NONE).
Unlike ARM (D76746) and AArch64 (D76754), we cannot delete FK_NONE from
getFixupKindSize because FK_NONE is still used by R_386_TLS_DESC_CALL/R_X86_64_TLSDESC_CALL.
Generalizes D61992. In GNU as, the .reloc directive supports arbitrary relocation types.
A MCFixupKind value `V` larger than or equal to FirstLiteralRelocationKind
is used to represent the relocation type whose number is V-FirstLiteralRelocationKind.
This is useful for linker tests. Without the feature the assembler
cannot produce certain relocation records (e.g. R_ARM_ALU_PC_G0/R_ARM_LDR_PC_G0)
This helps move forward D75349 and D76575.
Differential Revision: https://reviews.llvm.org/D76746
Summary:
There is a tiny logic error of D75300, making branch is not
correctly aligned with option -x86-pad-max-prefix-size
Reviewers: reames, MaskRay, craig.topper, LuoYuanke, jyknight
Reviewed By: reames
Subscribers: hiraditya, llvm-commits, annita.zhang
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76285
```
// llvm-objdump -d output (before)
400000: e8 0b 00 00 00 callq 11
400005: e8 0b 00 00 00 callq 11
// llvm-objdump -d output (after)
400000: e8 0b 00 00 00 callq 0x400010
400005: e8 0b 00 00 00 callq 0x400015
// GNU objdump -d. The lack of 0x is not ideal because the result cannot be re-assembled
400000: e8 0b 00 00 00 callq 400010
400005: e8 0b 00 00 00 callq 400015
```
In llvm-objdump, we pass the address of the next MCInst. Ideally we
should just thread the address of the current address, unfortunately we
cannot call X86MCCodeEmitter::encodeInstruction (X86MCCodeEmitter
requires MCInstrInfo and MCContext) to get the length of the MCInst.
MCInstPrinter::printInst has other callers (e.g llvm-mc -filetype=asm, llvm-mca) which set Address to 0.
They leave MCInstPrinter::PrintBranchImmAsAddress as false and this change is a no-op for them.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D76580
Summary:
Below InstAlias have been redefined, this patch is to remove the repeated
definition.
mtdec/mfdec mtsdr1/mfsdr1 mtsrr0/mfsrr0 mtsrr1/mfsrr1 mtasr
Reviewed By: nemanjai, steven.zhang
Differential Revision: https://reviews.llvm.org/D75821
Summary:
This patch introduces command-line support for the Armv8.6-a architecture and assembly support for BFloat16. Details can be found
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
in addition to the GCC patch for the 8..6-a CLI:
https://gcc.gnu.org/legacy-ml/gcc-patches/2019-11/msg02647.html
In detail this patch
- march options for armv8.6-a
- BFloat16 assembly
This is part of a patch series, starting with command-line and Bfloat16
assembly support. The subsequent patches will upstream intrinsics
support for BFloat16, followed by Matrix Multiplication and the
remaining Virtualization features of the armv8.6-a architecture.
Based on work by:
- labrinea
- MarkMurrayARM
- Luke Cheeseman
- Javed Asbar
- Mikhail Maltsev
- Luke Geeson
Reviewers: SjoerdMeijer, craig.topper, rjmccall, jfb, LukeGeeson
Reviewed By: SjoerdMeijer
Subscribers: stuij, kristof.beyls, hiraditya, dexonsmith, danielkiss, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D76062
This directive inserts code to add $gp to the argument's register when
support for position independent code is enabled.
For example, this code:
.cpadd $4
expands to:
addu $4, $4, $gp
Summary:
These were merged to the SIMD proposal in
https://github.com/WebAssembly/simd/pull/128.
Depends on D76397 to avoid merge conflicts.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76399
Add pseudo instructions for ldrsbt/ldrht/ldrsht with implicit immediate
and add fall back C++ code to transform the instruction to the
equivalent LDRSBTi/LDRHTi/LDRSHTi form.
This is similar to how it has been done in commit
fb3950ec63
This fixes:
https://bugs.llvm.org/show_bug.cgi?id=45070
Summary:
The current relaxation implementation is not correctly adjusting the
size and offsets of fragements in one section based on changes in size
of another if the layout order of the two happened to be such that the
former was visited before the later. Therefore, we need to invalidate
the fragments in all sections after each iteration of relaxation, and
possibly further relax some of them in the next ieration. This fixes
PR#45190.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76114
Note that I'm not asserting this code is correct; I'm simply adding coverage for what's there already. I'm reasonable sure the logic works for existing relaxable instructions, but I wouldn't be suprised if there were incorrect cases for other instructions. (i.e. is it legal to add prefixes to all instructions?)
Follow-up for D74433
What the function returns are almost standard BFD names, except that "ELF" is
in uppercase instead of lowercase.
This patch changes "ELF" to "elf" and changes ARM/AArch64 to use their BFD names.
MIPS and PPC64 have endianness differences as well, but this patch does not intend to address them.
Advantages:
* llvm-objdump: the "file format " line matches GNU objdump on ARM/AArch64 objects
* "file format " line can be extracted and fed into llvm-objcopy -O literally.
(https://github.com/ClangBuiltLinux/linux/issues/779 has such a use case)
Affected tools: llvm-readobj, llvm-objdump, llvm-dwarfdump, MCJIT (internal implementation detail, not exposed)
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D76046
Now that D75203 has landed and baked for a few days, extend the basic approach to prefix padding as well. The patch itself is fairly straight forward.
For the moment, this patch adds the functional support and some basic testing there of, but defaults to not enabling prefix padding. I want to be able to phrase a separate patch which adds the target specific reasoning and test it cleanly. I haven't decided whether I want to common it with the nop logic or not.
Differential Revision: https://reviews.llvm.org/D75300
These symbols need to be external (MSVC tools error out if a weak
external points at a symbol that isn't external; this was tried before
but had to be reverted in bc5b7217dc,
and this was originally explicitly fixed in
732eeaf2a9).
If multiple object files have weak symbols with defaults, their
defaults could cause linker errors due to duplicate definitions,
unless the names of the defaults are unique.
GNU binutils handles this by appending the name of another symbol
from the same object file to the name of the default symbol. Try
to implement something similar; before writing the object file,
locate a symbol that should have a unique name and use the name of
that one for making the weak defaults unique.
Differential Revision: https://reviews.llvm.org/D75989
Back in D42616, we switched our default nop length from 15 to 10 bytes because some platforms have painful decode stalls when encountering multiple instruction prefixes. (10 byte long nops come from the fact that prefixes are used to pad after 8 bytes, and some platforms have issues w/more than two prefixes.)
Based on Agner's guides, it appears to be the case that modern Intel (SandyBridge and later) can decode an arbitrary number of prefixes without issue. Intel's guide only provides up to 9 bytes; I read that as providing a safe default for all their chips. Older chips and Atom series have serious decode stalls. I can't find a conclusive reference beyond those two.
Differential Revision: https://reviews.llvm.org/D75945
Summary:
Currently, a BoundaryAlign fragment may be inserted after the branch
that needs to be aligned to truncate the current fragment, this fragment is
unused at most of time. To avoid that, we can insert a new empty Data
fragment instead. Non-relaxable instruction is usually emitted into Data
fragment, so the inserted empty Data fragment will be reused at a high
possibility.
Reviewers: annita.zhang, reames, MaskRay, craig.topper, LuoYuanke, jyknight
Reviewed By: reames, LuoYuanke
Subscribers: llvm-commits, dexonsmith, hiraditya
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75438
Summary:
Dollar signed prefixed integers were not allowed by the AsmParser to be
used as Identifiers, differing from the GNU assembler behavior.
This patch updates the parsing of Identifiers to consider such cases as
valid, where the identifier string includes the $ prefix itself. As the
Lexer currently splits these occurrences into separate tokens, those
need to be combined by the AsmParser itself.
Reviewers: efriedma, chill
Reviewed By: efriedma
Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75111
Summary:
ARMAsmParser was incorrectly dropping a leading dollar sign character
from symbol names in targets of branch instructions. This was caused by
an incorrect assumption that the contents following the dollar sign
token should be handled as a constant immediate, similarly to the #
token.
This patch avoids the operand parsing from consuming the dollar sign
token when it is followed by an identifier, making sure it is properly
parsed as part of the expression.
Reviewers: efriedma
Reviewed By: efriedma
Subscribers: danielkiss, chill, carwil, vhscampos, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73176
The new behavior matches GNU objdump. A pair of angle brackets makes tests slightly easier.
`.foo:` is not unique and thus cannot be used in a `CHECK-LABEL:` directive.
Without `-LABEL`, the CHECK line can match the `Disassembly of section`
line and causes the next `CHECK-NEXT:` to fail.
```
Disassembly of section .foo:
0000000000001634 .foo:
```
Bdragon: <> has metalinguistic connotation. it just "feels right"
Reviewed By: rupprecht
Differential Revision: https://reviews.llvm.org/D75713
This fixes several issues. The behavior changes are:
A SHN_COMMON symbol does not have the 'g' flag.
An undefined symbol does not have 'g' or 'l' flag.
A STB_GLOBAL SymbolRef::ST_Unknown symbol has the 'g' flag.
A STB_LOCAL SymbolRef::ST_Unknown symbol has the 'l' flag.
Reviewed By: rupprecht
Differential Revision: https://reviews.llvm.org/D75659
If we have an explicit align directive, we currently default to emitting nops to fill the space. As discussed in the context of the prefix padding work for branch alignment (D72225), we're allowed to play other tricks such as extending the size of previous instructions instead.
This patch will convert near jumps to far jumps if doing so decreases the number of bytes of nops needed for a following align. It does so as a post-pass after relaxation is complete. It intentionally works without moving any labels or doing anything which might require another round of relaxation.
The point of this patch is mainly to mock out the approach. The optimization implemented is real, and possibly useful, but the main point is to demonstrate an approach for implementing such "pad previous instruction" approaches. The key notion in this patch is to treat padding previous instructions as an optional optimization, not as a core part of relaxation. The benefit to this is that we avoid the potential concern about increasing the distance between two labels and thus causing further potentially non-local code grown due to relaxation. The downside is that we may miss some opportunities to avoid nops.
For the moment, this patch only implements a small set of existing relaxations.. Assuming the approach is satisfactory, I plan to extend this to a broader set of instructions where there are obvious "relaxations" which are roughly performance equivalent.
Note that this patch *doesn't* change which instructions are relaxable. We may wish to explore that separately to increase optimization opportunity, but I figured that deserved it's own separate discussion.
There are possible downsides to this optimization (and all "pad previous instruction" variants). The major two are potentially increasing instruction fetch and perturbing uop caching. (i.e. the usual alignment risks) Specifically:
* If we pad an instruction such that it crosses a fetch window (16 bytes on modern X86-64), we may cause the decoder to have to trigger a fetch it wouldn't have otherwise. This can effect both decode speed, and icache pressure.
* Intel's uop caching have particular restrictions on instruction combinations which can fit in a particular way. By moving around instructions, we can both cause misses an change misses into hits. Many of the most painful cases are around branch density, so I don't expect this to be too bad on the whole.
On the whole, I expect to see small swings (i.e. the typical alignment change problem), but nothing major or systematic in either direction.
Differential Revision: https://reviews.llvm.org/D75203
```
// clang -c -gdwarf-5 a.s -o a.o
.section .init; ret
.text; ret
```
.debug_info contains DW_AT_ranges and llvm-dwarfdump will report
a verification error because .debug_rnglists does not exist (not
implemented).
This patch generates .debug_rnglists for assembly files.
emitListsTableHeaderStart() in DwarfDebug.cpp can be shared with
MCDwarf.cpp. Because CodeGen depends on MC, I move the function to
MCDwarf.cpp
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D75375
X86 has several instructions which are documented as enabling interrupts exactly one instruction *after* the one which changes the SS segment register. Inserting a nop between these two instructions allows an interrupt to arrive before the execution of the following instruction which changes semantic behaviour.
The list of instructions is documented in "Table 24-3. Format of Interruptibility State" in Volume 3c of the Intel manual. They basically all come down to different ways to write to the SS register.
Differential Revision: https://reviews.llvm.org/D75359
When bundle is enabled, data fragment itself has a space to emit NOP
to bundle-align instructions. The behaviour makes it impossible for
us to determine whether the macro fusion really happen when emitting
instructions. In addition, boundary-align fragment is also used to
emit NOPs to align instructions, currently using them together sometimes
makes code crazy.
Differential Revision: https://reviews.llvm.org/D75346
Add ELF relocations for the following fixups:
fixup_thumb_adr_pcrel_10 -> R_ARM_THM_PC8
fixup_thumb_cp -> R_ARM_THM_PC8
fixup_t2_adr_pcrel_12 -> R_ARM_THM_PREL_11_0
fixup_t2_ldst_pcrel_12 -> R_ARM_THM_PC12
While these relocations are short-ranged there is support in the open
source ELF linker's in binutils and soon to be in LLD. MC will no longer
resolve pc-relative fixups to global symbols due to interpositioning
concerns. We can handle these at link time by implementing the relocations.
The R_ARM_THM_PC8 has some extra encoding rules for addends that llvm-mc
sidesteps by not supporting addends for these instructions, using the wide
Thumb 2 instruction if it is available. I think that this is a reasonable
compromise given that these are rare.
This partiall reverts D72892, the Thumb fixups no longer need to be
evaluated at assembly time.
Differential Revision: https://reviews.llvm.org/D75039
Support the explicit wide assembler qualifier for the dmb/dsb/isb synchronization barrier instructions.
Differential revision: https://reviews.llvm.org/D75143
MC currently does not emit these relocation types, and lld does not
handle them. Add FKF_Constant as a work-around of some ARM code after
D72197. Eventually we probably should implement these relocation types.
By Fangrui Song!
Differential revision: https://reviews.llvm.org/D72892
Adjusting by 2 breaks DWARF output. With this fix, programs start to
compile and produce valid DWARF output.
Differential Revision: https://reviews.llvm.org/D74213
While the value of the CIE pointer field in a DWARF FDE record is
an offset to the corresponding CIE record from the beginning of
the section, for EH FDE records it is relative to the current offset.
Previously, we did not make that distinction when dumped both kinds
of FDE records and just printed the same value for the CIE pointer
field and the CIE offset; that was acceptable for DWARF FDEs but was
wrong for EH FDEs.
This patch fixes the issue by explicitly printing the offset of the
linked CIE object.
Differential Revision: https://reviews.llvm.org/D74613
Simply by implementing a few functions I was able to correctly
disassemble a much larger amount of instructions.
Differential Revision: https://reviews.llvm.org/D74045
Heads-up message: https://lists.llvm.org/pipermail/llvm-dev/2020-February/139390.html
GNU as started to emit warnings for changed sh_type or sh_flags in 2000.
GNU as>=2.35 will emit errors for most sh_type/sh_flags change, and error for entsize change.
Some cases remain warnings for legacy reasons:
.section .init_array,"ax", @progbits
.section .init_array,"ax", @init_array
# And some obscure sh_flags changes (OS/Processor specific flags)
The rationale of a diagnostic (warning or error) is that sh_type,
sh_flags or sh_entsize changes usually indicate user errors. The values
are taken from the first .section directive. Successive directives are ignored.
We just try to be rigid and emit errors for all sh_type/sh_flags/sh_entsize change.
A possible improvement in the future is to reuse
llvm-readobj/ELFDumper.cpp:getSectionTypeString so that we can name the
type in the diagnostics.
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D73999
The changes the in-memory representation of wasm symbols such that their
optional ImportName and ImportModule use llvm::Optional.
ImportName is set whenever WASM_SYMBOL_EXPLICIT_NAME flag is set.
ImportModule (for imports) is currently always set since it defaults to
"env".
In the future we can possibly extent to binary format distingish
import which have explit module names.
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74109
Summary:
Extends the multivalue call infrastructure to tail calls, removes all
legacy calls specialized for particular result types, and removes the
CallIndirectFixup pass, since all indirect call arguments are now
fixed up directly in the post-insertion hook.
In order to keep supporting pretty-printed defs and uses in test
expectations, MCInstLower now inserts an immediate containing the
number of defs for each call and call_indirect. The InstPrinter is
updated to query this immediate if it is present and determine which
MCOperands are defs and uses accordingly.
Depends on D72902.
Reviewers: aheejin
Subscribers: dschuff, mgorny, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74192
Summary:
This patch adds assembly-level support for a new Arm M-profile
architecture extension, Custom Datapath Extension (CDE).
A brief description of the extension is available at
https://developer.arm.com/architectures/instruction-sets/custom-instructions
The latest specification for CDE is currently a beta release and is
available at
https://static.docs.arm.com/ddi0607/aa/DDI0607A_a_armv8m_arm_supplement_cde.pdf
CDE allows chip vendors to add custom CPU instructions. The CDE
instructions re-use the same encoding space as existing coprocessor
instructions (such as MRC, MCR, CDP etc.). Each coprocessor in range
cp0-cp7 can be configured as either general purpose (GCP) or custom
datapath (CDEv1). This configuration is defined by the CPU vendor and
is provided to LLVM using 8 subtarget features: cdecp0 ... cdecp7.
The semantics of CDE instructions are implementation-defined, but the
instructions are guaranteed to be pure (that is, they are stateless,
they do not access memory or any registers except their explicit
inputs/outputs).
CDE requires the CPU to support at least Armv8.0-M mainline
architecture. CDE includes 3 sets of instructions:
* Instructions that operate on general purpose registers and NZCV
flags
* Instructions that operate on the S or D register file (require
either FP or MVE extension)
* Instructions that operate on the Q register file, require MVE
The user-facing names that can be specified on the command line are
the same as the 8 subtarget feature names. For example:
$ clang -target arm-none-none-eabi -march=armv8m.main+cdecp0+cdecp3
tells the compiler that the coprocessors 0 and 3 are configured as
CDEv1 and the remaining coprocessors are configured as GCP (which is
the default).
Reviewers: simon_tatham, ostannard, dmgreen, eli.friedman
Reviewed By: simon_tatham
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74044
https://bugs.llvm.org/show_bug.cgi?id=44775
This rule has been implemented by GNU as https://sourceware.org/ml/binutils/2020-02/msg00028.html (binutils >= 2.35)
It allows us to simplify
```
.section .foo,"o",foo,unique,0
.section .foo,"o",bar,unique,1 # different section
```
to
```
.section .foo,"o",foo
.section .foo,"o",bar # different section
```
We consider the two `.foo` different even if the linked-to symbols foo and bar
are defined in the same section. This is a deliberate choice so that we don't
need to know the section where foo and bar are defined beforehand.
Differential Revision: https://reviews.llvm.org/D74006
Assembler now permits pairs like 'v0:1', which are encoded
differently from the odd-first pairs like 'v1:0'.
The compiler will require more work to leverage these new register
pairs.
This reverts commit 80a34ae311 with fixes.
Previously, since bots turning on EXPENSIVE_CHECKS are essentially turning on
MachineVerifierPass by default on X86 and the fact that
inline-asm-avx-v-constraint-32bit.ll and inline-asm-avx512vl-v-constraint-32bit.ll
are not expected to generate functioning machine code, this would go
down to `report_fatal_error` in MachineVerifierPass. Here passing
`-verify-machineinstrs=0` to make the intent explicit.
This reverts commit 80a34ae311 with fixes.
On bots llvm-clang-x86_64-expensive-checks-ubuntu and
llvm-clang-x86_64-expensive-checks-debian only,
llc returns 0 for these two tests unexpectedly. I tweaked the RUN line a little
bit in the hope that LIT is the culprit since this change is not in the
codepath these tests are testing.
llvm\test\CodeGen\X86\inline-asm-avx-v-constraint-32bit.ll
llvm\test\CodeGen\X86\inline-asm-avx512vl-v-constraint-32bit.ll
We do not keep the actual value of the CIE ID field, because it is
predefined, and use a constant when dumping a CIE record. The issue
was that the predefined value is different for .debug_frame and
.eh_frame sections, but we always printed the one which corresponds
to .debug_frame. The patch fixes that by choosing an appropriate
constant to print.
See the following for more information about .eh_frame sections:
https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/ehframechpt.html
Differential Revision: https://reviews.llvm.org/D73627
This reverts commit rGcd5b308b828e, rGcd5b308b828e, rG8cedf0e2994c.
There are issues to be investigated for polly bots and bots turning on
EXPENSIVE_CHECKS.
https://reviews.llvm.org/D74273
Pad macho section data to pointer size bytes, so that relocation
table and symbol table following section data will be pointer size
aligned.
Patch by pguo.
This patch makes the following System Registers Read Only:
- CurrentEL
- ICH_MISR_EL2
- PMBIDR_EL1
- PMSIDR_EL1
as found in:
https://developer.arm.com/docs/ddi0595/e/aarch64-system-registers
Relative line numbers were also added to the tests so we get more
informative error messages on failure.
Change-Id: I963b4f01ca5737b58f9e8e7abe9ca1d99e328758
Summary:
This is a rework of D72611, using @LINE to check that errors are
reported against the right instruction instead of adding lots of extra
*-ERR-NEXT: check lines.
Reviewers: rampitec, arsenm, nhaehnle
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74227
The registers TRCEXTINSELR and TRCEXTINSELR0 are distinct registers,
defined by separate extension specifications (ETM and ETE,
respectively), yet they use the same encoding in MSR/MRS.
When performing a system register lookup by encoding, we would
essentially return a random one, depending on the number, relative
position in the TableGen file, whether the TableGen records for system
registers are named or not, and, if they are named, depending on
record (not register!) name as well.
This patch works around the issue by explictly checking for the
TRCEXTINSELR/TRCEXTINSELR0 encoding and always returning TRCEXTINSELR.
Differential Revision: https://reviews.llvm.org/D74074
"linked-to section" is used by the ELF spec. By analogy, "linked-to
symbol" is a good name for the signature symbol. The word "linked-to"
implies a directed edge and makes it clear its relation with "sh_link",
while one can argue that "associated" means an undirected edge.
Also, combine tests and add precise SMLoc to improve diagnostics.
Reviewed By: eugenis, grimar, jhenderson
Differential Revision: https://reviews.llvm.org/D74082
The disassembler of the AVR backend is incomplete: most instructions do
not correctly disassemble yet.
This patch is the first in a series to add disassembly support to the
AVR backend. It starts with adding disassembler tests for instructions
that already disassemble correctly.
Differential Revision: https://reviews.llvm.org/D73911
Summary:
Implements the jump pseudo-instruction, which is used in e.g. the Linux kernel.
Reviewers: asb, lenary
Reviewed By: lenary
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73178
A known limitation for Future CPU is that the new prefixed instructions may
not cross 64 Byte boundaries.
All instructions are already 4 byte aligned so the only situation where this
can occur is when the prefix is in one 64 byte block and the instruction that
is prefixed is at the top of the next 64 byte block. To fix this case
PPCELFStreamer was added to intercept EmitInstruction. When a prefixed
instruction is emitted we try to align it to 64 Bytes by adding a maximum of
4 bytes. If the prefixed instruction crosses the 64 Byte boundary then the
alignment would trigger and a 4 byte nop would be added to push the
instruction into the next 64 byte block.
Differential Revision: https://reviews.llvm.org/D72570
A previous patch should have added pld and pstd and any support code in
the backend that is required for prefixed load and store type operations.
This patch adds a number of additional prefixed load and store type
instructions for the future CPU.
Differential Revision: https://reviews.llvm.org/D72577
This adds some missing MVE opcodes to evaluateBranch, which results in
llvm-objdump being able to print the PC relative branch target as an
annotation.
Differential Revision: https://reviews.llvm.org/D73553
Add the prefixed instructions pld and pstd to future CPU. These are load and
store instructions that require new operand types that are 34 bits. This patch
adds the two instructions as well as the operand types required.
Note that this patch also makes a minor change to tablegen to account for the
fact that some instructions are going to require shifts greater than 31 bits
for the new 34 bit instructions.
Differential Revision: https://reviews.llvm.org/D72574
Future CPU will include support for prefixed instructions.
These prefixed instructions are formed by a 4 byte prefix
immediately followed by a 4 byte instruction effectively
making an 8 byte instruction. The new instruction paddi
is a prefixed form of addi.
This patch adds paddi and all of the support required
for that instruction. The majority of the patch deals with
supporting the new prefixed instructions. The addition of
paddi is mainly to allow for testing.
Differential Revision: https://reviews.llvm.org/D72569
Summary:
This patch applies D60551 to an additional file. In particular, the test
is currently marked XFAIL for a number of big-endian targets; however,
the failure is actually dependent on the host endianness instead. The
test actually specifies a specific target triple.
Reviewers: rampitec, xingxue, daltenty
Reviewed By: rampitec
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, fedor.sergeev, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73192
Summary:
Previously, we would erroneously turn %pcrel_lo(label), where label has
a %pcrel_hi against a weak symbol, into %pcrel_lo(label + offset), as
evaluatePCRelLo would believe the target independent logic was going to
fold it. Moreover, even if that were fixed, shouldForceRelocation lacks
an MCAsmLayout and thus cannot evaluate the %pcrel_hi fixup to a value
and check the symbol, so we would then erroneously constant-fold the
%pcrel_lo whilst leaving the %pcrel_hi intact. After D72197, this same
sequence also occurs for symbols with global binding, which is triggered
in real-world code.
Instead, as discussed in D71978, we introduce a new FKF_IsTarget flag to
avoid these kinds of issues. All the resolution logic happens in one
place, with no coordination required between RISCAsmBackend and
RISCVMCExpr to ensure they implement the same logic twice. Although the
implementation of %pcrel_hi can be left as target independent, we make
it target dependent to ensure that they are handled identically to
%pcrel_lo, otherwise we risk one of them being constant folded but the
other being preserved. This also allows us to properly support fixup
pairs where the instructions are in different fragments.
Reviewers: asb, lenary, efriedma
Reviewed By: efriedma
Subscribers: arichardson, hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73211
The idea is to produce R_X86_64_PLT32 instead of
R_X86_64_PC32 for branches.
It fixes https://bugs.llvm.org/show_bug.cgi?id=44397.
This patch teaches MC to do that for JCC (jump if condition is met)
instructions. The new behavior matches modern GNU as.
It is similar to D43383, which did the same for "call/jmp foo",
but missed JCC cases.
Differential revision: https://reviews.llvm.org/D72831
This change has 2 components:
Target-independent: add a method getDwarfFrameBase to TargetFrameLowering. It
describes how the Dwarf frame base will be encoded. That can be a register (the
default), the CFA (which replaces NVPTX-specific logic in DwarfCompileUnit), or
a DW_OP_WASM_location descriptr.
WebAssembly: Allow WebAssemblyFunctionInfo::getFrameRegister to return the
correct virtual register instead of FP32/SP32 after WebAssemblyReplacePhysRegs
has run. Make WebAssemblyExplicitLocals store the local it allocates for the
frame register. Use this local information to implement getDwarfFrameBase
The result is that the DW_AT_frame_base attribute is correctly encoded for each
subprogram, and each param and local variable has a correct DW_AT_location that
uses DW_OP_fbreg to refer to the frame base.
This is a reland of rG3a05c3969c18 with fixes for the expensive-checks
and Windows builds
Differential Revision: https://reviews.llvm.org/D71681