Commit Graph

8315 Commits

Author SHA1 Message Date
Cullen Rhodes 3a349d2269 [AArch64][SME] Introduce feature for streaming mode
The Scalable Matrix Extension (SME) introduces a new execution mode
called Streaming SVE mode. In streaming mode a substantial subset of the
SVE and SVE2 instruction set is available, along with new outer product,
load, store, extract and insert instructions that operate on the new
architectural register state for the matrix.

To support streaming mode this patch introduces a new subtarget feature
+streaming-sve. If enabled, the subset of SVE(2) instructions are
available. The existing behaviour for SVE(2) remains unchanged, the
subset of instructions that are legal in streaming mode are enabled if
either +sve[2] or +streaming-sve is specified. Instructions that are
illegal in streaming mode remain predicated on +sve[2].

The SME target feature has been updated to imply +streaming-sve rather
than +sve.

The following changes are made to the SVE(2) tests:
  * For instructions that are legal in streaming mode:
    - added RUN line to verify +streaming-sve enables the instruction.
    - updated diagnostic to 'instruction requires: streaming-sve or sve'.
  * For instructions that are illegal in streaming-mode:
    - added RUN line to verify +streaming-sve does not enable the
      instruction.

SVE(2) instructions that are legal in streaming mode have:

  if !HaveSVE[2]() && !HaveSME() then UNDEFINED;

at the top of the pseudocode in the XML.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06/SVE-Instructions

Reviewed By: sdesmalen, david-arm

Differential Revision: https://reviews.llvm.org/D106272
2021-07-30 07:30:45 +00:00
Mark Schimmel e622c99f30 [ARC] Add norm/normh instructions with disassembly tests
Add disassembler support for the NORM and NORMH instructions. These instructions
only exist when the ARC processor is configured with the "norm" extension.

fferential Revision: https://reviews.llvm.org/D107118
2021-07-29 17:54:52 -07:00
Thomas Johnson cc238a6e03 [ARC] Add additional mov immediate instruction formats with a fix for u6 decoding
Differential Revision: https://reviews.llvm.org/D107088
2021-07-29 16:41:55 -07:00
Cullen Rhodes 2e27c4e1f1 [AArch64][SME] Add zero instruction
This patch adds the zero instruction for zeroing a list of 64-bit
element ZA tiles. The instruction takes a list of up to eight tiles
ZA0.D-ZA7.D, which must be in order, e.g.

  zero {za0.d,za1.d,za2.d,za3.d,za4.d,za5.d,za6.d,za7.d}
  zero {za1.d,za3.d,za5.d,za7.d}

The assembler also accepts 32-bit, 16-bit and 8-bit element tiles which
are mapped to corresponding 64-bit element tiles in accordance with the
architecturally defined mapping between different element size tiles,
e.g.

  * Zeroing ZA0.B, or the entire array name ZA, is equivalent to zeroing
    all eight 64-bit element tiles ZA0.D to ZA7.D.
  * Zeroing ZA0.S is equivalent to zeroing ZA0.D and ZA4.D.

The preferred disassembly of this instruction uses the shortest list of
tile names that represent the encoded immediate mask, e.g.

  * An immediate which encodes 64-bit element tiles ZA0.D, ZA1.D, ZA4.D and
    ZA5.D is disassembled as {ZA0.S, ZA1.S}.
  * An immediate which encodes 64-bit element tiles ZA0.D, ZA2.D, ZA4.D and
    ZA6.D is disassembled as {ZA0.H}.
  * An all-ones immediate is disassembled as {ZA}.
  * An all-zeros immediate is disassembled as an empty list {}.

This patch adds the MatrixTileList asm operand and related parsing to support
this.

Depends on D105570.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D105575
2021-07-27 08:35:45 +00:00
Lei Huang 64a15817a0 [PowerPC]Add addex instruction definition and MC tests
Add td definitions and asm/disasm tests for the addex instruction introduced in
ISA 3.0.

Reviewed By: nemanjai, amyk, NeHuang

Differential Revision: https://reviews.llvm.org/D106666
2021-07-26 14:55:38 -05:00
Michael Liao b0402a35fc [amdgpu] Add 64-bit PC support when expanding unconditional branches.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D106445
2021-07-26 14:50:30 -04:00
Ulrich Weigand 8cd8120a7b [SystemZ] Add support for new cpu architecture - arch14
This patch adds support for the next-generation arch14
CPU architecture to the SystemZ backend.

This includes:
- Basic support for the new processor and its features.
- Detection of arch14 as host processor.
- Assembler/disassembler support for new instructions.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining  __VEC__ == 10304.

Note: No currently available Z system supports the arch14
architecture.  Once new systems become available, the
official system name will be added as supported -march name.
2021-07-26 16:57:28 +02:00
Cullen Rhodes e6ff9179ce [AArch64][AsmParser] NFC: Parser.getTok().getLoc() -> getLoc()
Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D106635
2021-07-26 09:36:34 +00:00
Thomas Johnson 51d8e67e88 [ARC] Add tablegen definition for the Find Leading Set (FLS) instruction
Differential Revision: https://reviews.llvm.org/D106602
2021-07-22 17:42:25 -07:00
Thomas Johnson 1cda1e6186 [ARC] Add disassembly for the conditioned RSUB immediate instruction
Differential Revision: https://reviews.llvm.org/D106497
2021-07-22 11:34:39 -07:00
Cullen Rhodes 00e87e1c5b [AArch64][SME] Improve diagnostic for vector select register
Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D106540
2021-07-22 13:46:40 +00:00
Simon Tatham bd41136746 [clang] Use i64 for the !srcloc metadata on asm IR nodes.
This is part of a patch series working towards the ability to make
SourceLocation into a 64-bit type to handle larger translation units.

!srcloc is generated in clang codegen, and pulled back out by llvm
functions like AsmPrinter::emitInlineAsm that need to report errors in
the inline asm. From there it goes to LLVMContext::emitError, is
stored in DiagnosticInfoInlineAsm, and ends up back in clang, at
BackendConsumer::InlineAsmDiagHandler(), which reconstitutes a true
clang::SourceLocation from the integer cookie.

Throughout this code path, it's now 64-bit rather than 32, which means
that if SourceLocation is expanded to a 64-bit type, this error report
won't lose half of the data.

The compiler will tolerate both of i32 and i64 !srcloc metadata in
input IR without faulting. Test added in llvm/MC. (The semantic
accuracy of the metadata is another matter, but I don't know of any
situation where that matters: if you're reading an IR file written by
a previous run of clang, you don't have the SourceManager that can
relate those source locations back to the original source files.)

Original version of the patch by Mikhail Maltsev.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D105491
2021-07-22 10:24:52 +01:00
Carl Ritson 6efb3220b4 [AMDGPU] Add VReg_192/VReg_224 support for MIMG instructions
Allow MIMG instructions to be selected with 6/7 VGPRs for vaddr.
Previously these were rounded up to VReg_256 this saves VGPRs.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D103800
2021-07-22 10:42:15 +09:00
Cullen Rhodes 008c755d76 [AArch64][SME] Support .arch and .arch_extension assembler directives
Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D105566
2021-07-21 08:40:27 +00:00
Cullen Rhodes 2d80bbd939 [AArch64][SME] Add mova instructions
This patch adds the mova instruction to insert/extract an SVE vector
register to/from a ZA tile vector.

The preferred MOV aliases are also implemented.

Depends on D105572.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Reviewed By: david-arm, CarolineConcatto

Differential Revision: https://reviews.llvm.org/D105574
2021-07-21 08:20:01 +00:00
Cullen Rhodes 6c32cfe85c [AArch64][SME] Add ldr and str instructions
The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Reviewed By: kmclaughlin

Differential Revision: https://reviews.llvm.org/D105573
2021-07-21 08:17:13 +00:00
Craig Topper 81efb82570 [RISCV] Teach RISCVMatInt about cases where it can use LUI+SLLI to replace LUI+ADDI+SLLI for large constants.
If we need to shift left anyway we might be able to take advantage
of LUI implicitly shifting its immediate left by 12 to cover part
of the shift. This allows us to use more bits of the LUI immediate
to avoid an ADDI.

isDesirableToCommuteWithShift now considers compressed instruction
opportunities when deciding if commuting should be allowed.

I believe this is the same or similar to one of the optimizations
from D79492.

Reviewed By: luismarques, arcbbb

Differential Revision: https://reviews.llvm.org/D105417
2021-07-20 09:22:06 -07:00
Cullen Rhodes 15af3aaa2e [AArch64][SME] Add system registers and related instructions
This patch adds the new system registers introduced in SME:

  - ID_AA64SMFR0_EL1 (ro) SME feature identifier.
  - SMCR_ELx (r/w) streaming mode control register for configuring
    effective SVE Streaming SVE Vector length when the PE is in
    Streaming SVE mode.
  - SVCR (r/w) streaming vector control register, visible at all
    exception levels. Provides access to PSTATE.SM and PSTATE.ZA
    using MSR and MRS instructions.
  - SMPRI_EL1 (r/w) streaming mode execution priority register.
  - SMPRIMAP_EL2 (r/w) streaming mode priority mapping register.
  - SMIDR_EL1 (ro) streaming mode identification register.
  - TPIDR2_EL0 (r/w) for use by SME software to manage per-thread
    SME context.
  - MPAMSM_EL1 (r/w) MPAM (v8.4) streaming mode register, for
    labelling memory accesses performed in streaming mode.

Also added in this patch are the SME mode change instructions.
Three MSR immediate instructions are implemented to set or clear
PSTATE.SM, PSTATE.ZA, or both respectively:

  - MSR SVCRSM, #<imm1>
  - MSR SVCRZA, #<imm1>
  - MSR SVCRSMZA, #<imm1>

The following smstart/smstop aliases are also implemented for
convenience:

  smstart    -> MSR SVCRSMZA, #1
  smstart sm -> MSR SVCRSM,   #1
  smstart za -> MSR SVCRZA,   #1

  smstop     -> MSR SVCRSMZA, #0
  smstop sm  -> MSR SVCRSM,   #0
  smstop za  -> MSR SVCRZA,   #0

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D105576
2021-07-20 08:06:26 +00:00
Derek Schuff ad1f5457d2 [WebAssembly] Generate R_WASM_FUNCTION_OFFSET relocs in debuginfo sections
Debug info sections need R_WASM_FUNCTION_OFFSET_I32 relocs (with FK_Data_4 fixup
kinds) to refer to functions (instead of R_WASM_TABLE_INDEX as is used in data
sections). Usually this is done in a convoluted way, with unnamed temp data
symbols which target the start of the function, in which case
WasmObjectWriter::recordRelocation converts it to use the section symbol
instead. However in some cases the function can actually be undefined; in this
case the dwarf generator uses the function symbol (a named undefined function
symbol) instead. In that case the section-symbol transform doesn't work and we
need to generate the correct reloc type a different way. In this change
WebAssemblyWasmObjectWriter::getRelocType takes the fixup section type into
account to choose the correct reloc type.

Fixes PR50408
Differential Revision: https://reviews.llvm.org/D103557
2021-07-19 14:02:33 -07:00
Wouter van Oortmerssen 670944fb20 [WebAssembly] Support R_WASM_MEMORY_ADDR_TLS_SLEB64 for wasm64
Also fixed TLS tests swapping addr & value in store op
Differential Revision: https://reviews.llvm.org/D106096
2021-07-19 10:22:43 -07:00
Cullen Rhodes f91eaa7007 [AArch64][SME] Add SVE2 instructions added in SME
This patch adds support for the following instructions:

    SCLAMP, UCLAMP, REV, DUP (predicate)

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Reviewed By: kmclaughlin

Differential Revision: https://reviews.llvm.org/D105577
2021-07-19 08:03:05 +00:00
Craig Topper 4dbb788068 [RISCV] Teach constant materialization that it can use zext.w at the end with Zba to reduce number of instructions.
If the upper 32 bits are zero and bit 31 is set, we might be able to
use zext.w to fill in the zeros after using an lui and/or addi.

Most of this patch is plumbing the subtarget features into the constant
materialization.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D105509
2021-07-16 09:35:56 -07:00
Cullen Rhodes 99eb96f031 [AArch64][SME] Add load and store instructions
This patch adds support for following contiguous load and store
instructions:

  * LD1B, LD1H, LD1W, LD1D, LD1Q
  * ST1B, ST1H, ST1W, ST1D, ST1Q

A new register class and operand is added for the 32-bit vector select
register W12-W15. The differences in the following tests which have been
re-generated are caused by the introduction of this register class:

  * llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-inline-asm.ll
  * llvm/test/CodeGen/AArch64/GlobalISel/regbank-inlineasm.mir
  * llvm/test/CodeGen/AArch64/stp-opt-with-renaming-reserved-regs.mir
  * llvm/test/CodeGen/AArch64/stp-opt-with-renaming.mir

D88663 attempts to resolve the issue with the store pair test
differences in the AArch64 load/store optimizer.

The GlobalISel differences are caused by changes in the enum values of
register classes, tests have been updated with the new values.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Reviewed By: CarolineConcatto

Differential Revision: https://reviews.llvm.org/D105572
2021-07-16 10:11:10 +00:00
Harald van Dijk a8ad917054
[X86] Fix handling of maskmovdqu in X32
The maskmovdqu instruction is an odd one: it has a 32-bit and a 64-bit
variant, the former using EDI, the latter RDI, but the use of the
register is implicit. In 64-bit mode, a 0x67 prefix can be used to get
the version using EDI, but there is no way to express this in
assembly in a single instruction, the only way is with an explicit
addr32.

This change adds support for the instruction. When generating assembly
text, that explicit addr32 will be added. When not generating assembly
text, it will be kept as a single instruction and will be emitted with
that 0x67 prefix. When parsing assembly text, it will be re-parsed as
ADDR32 followed by MASKMOVDQU64, which still results in the correct
bytes when converted to machine code.

The same applies to vmaskmovdqu as well.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D103427
2021-07-15 22:56:08 +01:00
Fangrui Song aa3df8ddcd [test] Avoid llvm-readelf/llvm-readobj one-dash long options and deprecated aliases (e.g. --file-headers) 2021-07-15 10:26:21 -07:00
Cullen Rhodes dfa76933c2 [AArch64][SME] Add outer product instructions
This patch adds support for the following outer product instructions:

  * BFMOPA, BFMOPS, FMOPA, FMOPS, SMOPA, SMOPS, SUMOPA, SUMOPS, UMOPA,
    UMOPS, USMOPA, USMOPS.

Depends on D105570.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D105571
2021-07-15 09:51:06 +00:00
Thomas Lively 122b0220fd [WebAssembly] Remove datalayout strings from llc tests
The data layout strings do not have any effect on llc tests and will become
misleadingly out of date as we continue to update the canonical data layout, so
remove them from the tests.

Differential Revision: https://reviews.llvm.org/D105842
2021-07-14 11:17:08 -07:00
Jinsong Ji fe52296a34 [AIX] Enable dollar sign as PC in inlineasm
$ is used as PC for PowerPC inlineasm, ELF use it,
enable it for AIX XCOFF as well.

Reviewed By: #powerpc, amyk, nemanjai

Differential Revision: https://reviews.llvm.org/D105956
2021-07-14 13:37:52 +00:00
Cullen Rhodes c08dabb0f4 [AArch64][SME] Add matrix register definitions and parsing support
SME introduces the ZA array, a new piece of architectural register state
consisting of a matrix of [SVLb x SVLb] bytes, where SVL is the
implementation defined Streaming SVE vector length and SVLb is the
number of 8-bit elements in a vector of SVL bits.

SME instructions consist of three types of matrix operands:

  * Tiles: a ZA tile is a square, two-dimensional sub-array of elements
  within the ZA array. These tiles make up the larger accumulator array
  and the granularity varies based on the element size, i.e.
    - ZAQ0..ZAQ15 (smallest tile granule)
    - ZAD0..ZAD7
    - ZAS0..ZAS3
    - ZAH0..ZAH1
    or ZAB0       (largest tile granule, single tile)
  * Tile vectors: similar to regular tiles, but have an extra 'h' or 'v'
  to tell how the vector at [reg+offset] is layed out in the tile,
  horizontally or vertically. E.g. za1h.h or za15v.q, which corresponds
  to vectors in registers ZAH1 and ZAQ15, respectively.
  * Accumulator matrix: this is the entire accumulator array ZA.

This patch adds the register classes and related operands and parsing
for SME instructions operating on the accumulator array.

The ADDHA and ADDVA instructions which operate on tiles are also added
in this patch to make some use of the code added, later patches will
make use of the other operands introduced here.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Co-authored by: Sander de Smalen (@sdesmalen)

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D105570
2021-07-14 08:25:49 +00:00
Jinsong Ji 64785ac12e [AIX] Update testcase to use aix triple
We have implemented the basic MCAsmParser now, we can use the triple
directly now.
2021-07-14 03:32:37 +00:00
Hafiz Abid Qadeer b205f2bb89 [AMDGPU] Handle s_branch to another section.
Currently, if target of s_branch instruction is in another section, it will fail with the error of undefined label.  Although in this case, the label is not undefined but present in another section. This patch tries to handle this issue. So while handling fixup_si_sopp_br fixup in getRelocType, if the target label is undefined we issue an error as before. If it is defined, a new relocation type R_AMDGPU_REL16 is returned.

This issue has been reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100181 and https://bugs.llvm.org/show_bug.cgi?id=45887. Before https://reviews.llvm.org/D79943, we used to get an crash for this scenario. The crash is fixed now but the we still get an undefined label error.  Jumps to other section can arise with hold/cold splitting.

A patch to handle the relocation in lld will follow shortly.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D105760
2021-07-13 12:17:47 +01:00
Cullen Rhodes 9e42675103 [AArch64] Add target features for Armv9-A Scalable Matrix Extension (SME)
First patch in a series adding MC layer support for the Arm Scalable
Matrix Extension.

This patch adds the following features:

    sme, sme-i64, sme-f64

The sme-i64 and sme-f64 flags are for the optional I16I64 and F64F64
features.

If a target supports I16I64 then the following instructions are
implemented:

  * 64-bit integer ADDHA and ADDVA variants (D105570).
  * SMOPA, SMOPS, SUMOPA, SUMOPS, UMOPA, UMOPS, USMOPA, and USMOPS
    instructions that accumulate 16-bit integer outer products into 64-bit
    integer tiles.

If a target supports F64F64 then the FMOPA and FMOPS instructions that
accumulate double-precision floating-point outer products into
double-precision tiles are implemented.

Outer products are implemented in D105571.

The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2021-06

Reviewed By: CarolineConcatto

Differential Revision: https://reviews.llvm.org/D105569
2021-07-12 13:28:10 +00:00
Wouter van Oortmerssen 9647a6f719 [WebAssembly] Added initial type checker to MC Assembler
This to protect against non-sensical instruction sequences being assembled,
which would either cause asserts/crashes further down, or a Wasm module being output that doesn't validate.

Unlike a validator, this type checker is able to give type-errors as part of the parsing process, which makes the assembler much friendlier to be used by humans writing manual input.

Because the MC system is single pass (instructions aren't even stored in MC format, they are directly output) the type checker has to be single pass as well, which means that from now on .globaltype and .functype decls must come before their use. An extra pass is added to Codegen to collect information for this purpose, since AsmPrinter is normally single pass / streaming as well, and would otherwise generate this information on the fly.

A `-no-type-check` flag was added to llvm-mc (and any other tools that take asm input) that surpresses type errors, as a quick escape hatch for tests that were not intended to be type correct.

This is a first version of the type checker that ignores control flow, i.e. it checks that types are correct along the linear path, but not the branch path. This will still catch most errors. Branch checking could be added in the future.

Differential Revision: https://reviews.llvm.org/D104945
2021-07-09 14:07:25 -07:00
Anirudh Prasad 7bc1baea6e [MCParser][z/OS] Mark a few tests as unsupported for the z/OS Target
- Background here is that that these sets of tests are "invalid" to be run on z/OS
- The reason is because these test constructs that HLASM never supports (HLASM doesn't support GNU style directives)
- Usually tests are geared towards a particular target via the use of a triple that targets just that platform, but these tests require the use of a "default triple"
- Thus, we mark these tests as "UNSUPPORTED" for z/OS since we don't want to run these for z/OS

Reviewed By: yusra.syeda, abhina.sreeskantharajan

Differential Revision: https://reviews.llvm.org/D105204
2021-07-05 11:06:52 -04:00
Jinsong Ji bf64210fd8 [AIX] Add dummy XCOFF MCAsmParserExtension
Implement XCOFFMCAsmParser so that we can use MC to parse inline asm.

The directives and storage mapping classes will be added later
iteratively.

Reviewed By: xgupta

Differential Revision: https://reviews.llvm.org/D105259
2021-07-02 16:12:21 +00:00
Igor Kudrin 657e067bb5 [ARMInstPrinter] Print the target address of a branch instruction
This follows other patches that changed printing immediate values of
branch instructions to target addresses, see D76580 (x86), D76591 (PPC),
D77853 (AArch64).

As observing immediate values might sometimes be useful, they are
printed as comments for branch instructions.

// llvm-objdump -d output (before)
000200b4 <_start>:
   200b4: ff ff ff fa   blx     #-4 <thumb>
000200b8 <thumb>:
   200b8: ff f7 fc ef   blx     #-8 <_start>

// llvm-objdump -d output (after)
000200b4 <_start>:
   200b4: ff ff ff fa   blx     0x200b8 <thumb>         @ imm = #-4
000200b8 <thumb>:
   200b8: ff f7 fc ef   blx     0x200b4 <_start>        @ imm = #-8

// GNU objdump -d.
000200b4 <_start>:
   200b4:       faffffff        blx     200b8 <thumb>
000200b8 <thumb>:
   200b8:       f7ff effc       blx     200b4 <_start>

Differential Revision: https://reviews.llvm.org/D104701
2021-06-30 16:35:28 +07:00
Fangrui Song a9854045f6 [test] Change -t to --syms and -s to -S for llvm-readobj RUN lines
-s and -t will be changed to improve consistency with llvm-readelf.
The inconsistency issue regularly contributes to confusion using the two tools.
2021-06-29 11:50:31 -07:00
Soham Dixit 51d969dc27 [DebugInfo] Bug 41152 - Improve dumping of empty location expressions
Fixes PR41152 (https://bugs.llvm.org/show_bug.cgi?id=41152).

Reviewed by: jhenderson, dblaikie, SouraVX

Differential Revision: https://reviews.llvm.org/D103502
2021-06-29 09:21:00 +01:00
David Spickett 558d9e8228 [llvm][ARM] Treat xscale arch as an alias of armv5te
Previously xscale was known to everything apart
from the ELF streamer so we would crash as soon
as you tried to output an object file.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D104776
2021-06-28 15:20:24 +00:00
Lucas Prates 88b1135e72 [Aarch64] Adding support for Armv9-A Realm Management Extension
This adds support for Armv9-A's Realm Management Extension, including
three new system registers - MFAR_EL3, GPCCR_EL3 and GPTBR_EL3 - and
four new TLBI instructions.

The reference for the Realm Management Extension can be found at: https://developer.arm.com/documentation/ddi0615/aa.

Based on patches by Victor Campos.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D104773
2021-06-28 13:45:22 +01:00
Igor Kudrin e7fffa6f03 [llvm-objdump] Prefix memory operand addresses with '0x'
This helps to avoid ambiguity when the address contains only digits 0..9.

Differential Revision: https://reviews.llvm.org/D104909
2021-06-28 14:25:21 +07:00
Ulrich Weigand b2674670f2 [SystemZ] Add support for .reloc assembler directive
Add support for the .reloc directive along the lines of
other back-ends.

This fixes a regression after https://reviews.llvm.org/D104080
was merged, since that patch presupposed support for .reloc.
2021-06-25 21:51:10 +02:00
Fangrui Song ca3bdb57fa [MC][ELF] Change SHT_LLVM_CALL_GRAPH_PROFILE relocations from SHT_RELA to SHT_REL
... even on targets preferring RELA. The section is only consumed by ld.lld
which can handle REL.

Follow-up to D104080 as I explained in the review. There are two advantages:

* The D104080 code only handles RELA, so arm/i386/mips32 etc may warn for -fprofile-use=/-fprofile-sample-use= usage.
* Decrease object file size for RELA targets

While here, change the relocation to relocate weights, instead of 0,1,2,3,..
I failed to catch the issue during review.
2021-06-24 21:35:48 -07:00
Aakanksha Patil 3453f3dd46 [AMDGPU] Add gfx1035 target
Differential Revision: https://reviews.llvm.org/D104804
2021-06-24 14:32:41 -04:00
Alexander Yermolovich a224c5199b [LLD][LLVM] CG Graph profile using relocations
Currently when .llvm.call-graph-profile is created by llvm it explicitly encodes the symbol indices. This section is basically a black box for post processing tools. For example, if we run strip -s on the object files the symbol table changes, but indices in that section do not. In non-visible behavior indices point to wrong symbols. The visible behavior indices point outside of Symbol table: "invalid symbol index".

This patch changes the format by using R_*_NONE relocations to indicate the from/to symbols. The Frequency (Weight) will still be in the .llvm.call-graph-profile, but symbol information will be in relocation section. In LLD information from both sections is used to reconstruct call graph profile. Relocations themselves will never be applied.

With this approach post processing tools that handle relocations correctly work for this section also. Tools can add/remove symbols and as long as they handle relocation sections with this approach information stays correct.

Doing a quick experiment with clang-13.
The size went up from 107KB to 322KB, aggregate of all the input sections. Size of clang-13 binary is ~118MB. For users of -fprofile-use/-fprofile-sample-use the size of object files will go up slightly, it will not impact final binary size.

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D104080
2021-06-24 09:09:33 -07:00
Fangrui Song c618692218 [AArch64][X86] Allow 64-bit label differences lower to IMAGE_REL_*_REL32
`IMAGE_REL_ARM64_REL64/IMAGE_REL_AMD64_REL64` do not exist and `.quad a - .` is
currently not representable.

For instrumentation, `.quad a - .` is useful representing a cross-section
reference in a metadata section, to allow ELF medium/large code models. The COFF
limitation makes such generic instrumentations inconvenient. I plan to make a
PGO/coverage metadata section field relative in D104556.

Differential Revision: https://reviews.llvm.org/D104564
2021-06-21 14:32:25 -07:00
Saleem Abdulrasool b30bc8cc5d RISCV: simplify a test case for RISCV (NFCI)
The output of the object file is unimportant and entirely discarded.
Simply redirect the output to `/dev/null` or `NUL` as the case may be.
Additionally, the space between the labels is unimportant.  There is no
need to add space between the labels.  Two labels at the same address
are sufficient to generate the difference expression and should still
test the same behaviour.
2021-06-18 08:19:16 -07:00
Heejin Ahn 1d891d44f3 [WebAssembly] Rename event to tag
We recently decided to change 'event' to 'tag', and 'event section' to
'tag section', out of the rationale that the section contains a
generalized tag that references a type, which may be used for something
other than exceptions, and the name 'event' can be confusing in the web
context.

See
- https://github.com/WebAssembly/exception-handling/issues/159#issuecomment-857910130
- https://github.com/WebAssembly/exception-handling/pull/161

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D104423
2021-06-17 20:34:19 -07:00
Saleem Abdulrasool 116841c623 RISCV: clean up target expression handling
The target specific expression handling was slightly regressed by
bbea64250f.  This restores the proper
sub-expression evaluation to allow for constant folding within the
expression.  We explicitly discard the layout and assembler when
evaluating the expression to avoid any symbolic computation and instead
using the `evaluateAsRelocatable` to canonicalise and constant fold
only.

We can also simplify the expression handling - none of the target
variants support symbolic difference.  This simplifies the logic for
that and adds additional tests to ensure that we do not accidentally
regress here in the future.

Reviewed By: maskray

Differential Revision: https://reviews.llvm.org/D104473
2021-06-17 13:35:32 -07:00
Saleem Abdulrasool bbea64250f RISCV: adjust handling of relocation emission for RISCV
This re-architects the RISCV relocation handling to bring the
implementation closer in line with the implementation in binutils.  We
would previously aggressively resolve the relocation.  With this
restructuring, we always will emit a paired relocation for any symbolic
difference of the type of S±T[±C] where S and T are labels and C is a
constant.

GAS has a special target hook controlled by `RELOC_EXPANSION_POSSIBLE`
which indicates that a fixup may be expanded into multiple relocations.
This is used by the RISCV backend to always emit a paired relocation -
either ADD[WIDTH] + SUB[WIDTH] for text relocations or SET[WIDTH] +
SUB[WIDTH] for a debug info relocation.  Irrespective of whether linker
relaxation support is enabled, symbolic difference is always emitted as
a paired relocation.

This change also sinks the target specific behaviour down into the
target specific area rather than exposing it to the shared relocation
handling.  In the process, we also sink the "special" handling for debug
information down into the RISCV target.  Although this improves the path
for the other targets, this is not necessarily entirely ideal either.
The changes in the debug info emission could be done through another
type of hook as this functionality would be required by any other target
which wishes to do linker relaxation.  However, as there are no other
targets in LLVM which currently do this, this is a reasonable thing to
do until such time as the code needs to be shared.

Improve the handling of the relocation (and add a reduced test case from
the Linux kernel) to ensure that we handle complex expressions for
symbolic difference.  This ensures that we correct relocate symbols with
the adddends normalized and associated with the addition portion of the
paired relocation.

This change also addresses some review comments from Alex Bradbury about
the relocations meant for use in the DWARF CFA being named incorrectly
(using ADD6 instead of SET6) in the original change which introduced the
relocation type.

This resolves the issues with the symbolic difference emission
sufficiently to enable building the Linux kernel with clang+IAS+lld
(without linker relaxation).

Resolves PR50153, PR50156!
Fixes: ClangBuiltLinux/linux#1023, ClangBuiltLinux/linux#1143

Reviewed By: nickdesaulniers, maskray

Differential Revision: https://reviews.llvm.org/D103539
2021-06-17 08:20:02 -07:00