llvm-project

Commit Graph

Author	SHA1	Message	Date
Cullen Rhodes	3a349d2269	[AArch64][SME] Introduce feature for streaming mode The Scalable Matrix Extension (SME) introduces a new execution mode called Streaming SVE mode. In streaming mode a substantial subset of the SVE and SVE2 instruction set is available, along with new outer product, load, store, extract and insert instructions that operate on the new architectural register state for the matrix. To support streaming mode this patch introduces a new subtarget feature +streaming-sve. If enabled, the subset of SVE(2) instructions are available. The existing behaviour for SVE(2) remains unchanged, the subset of instructions that are legal in streaming mode are enabled if either +sve[2] or +streaming-sve is specified. Instructions that are illegal in streaming mode remain predicated on +sve[2]. The SME target feature has been updated to imply +streaming-sve rather than +sve. The following changes are made to the SVE(2) tests: * For instructions that are legal in streaming mode: - added RUN line to verify +streaming-sve enables the instruction. - updated diagnostic to 'instruction requires: streaming-sve or sve'. * For instructions that are illegal in streaming-mode: - added RUN line to verify +streaming-sve does not enable the instruction. SVE(2) instructions that are legal in streaming mode have: if !HaveSVE[2]() && !HaveSME() then UNDEFINED; at the top of the pseudocode in the XML. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06/SVE-Instructions Reviewed By: sdesmalen, david-arm Differential Revision: https://reviews.llvm.org/D106272	2021-07-30 07:30:45 +00:00
Mark Schimmel	e622c99f30	[ARC] Add norm/normh instructions with disassembly tests Add disassembler support for the NORM and NORMH instructions. These instructions only exist when the ARC processor is configured with the "norm" extension. Differential Revision: https://reviews.llvm.org/D107118	2021-07-29 17:54:52 -07:00
Thomas Johnson	cc238a6e03	[ARC] Add additional mov immediate instruction formats with a fix for u6 decoding Differential Revision: https://reviews.llvm.org/D107088	2021-07-29 16:41:55 -07:00
Cullen Rhodes	2e27c4e1f1	[AArch64][SME] Add zero instruction This patch adds the zero instruction for zeroing a list of 64-bit element ZA tiles. The instruction takes a list of up to eight tiles ZA0.D-ZA7.D, which must be in order, e.g. zero {za0.d,za1.d,za2.d,za3.d,za4.d,za5.d,za6.d,za7.d} zero {za1.d,za3.d,za5.d,za7.d} The assembler also accepts 32-bit, 16-bit and 8-bit element tiles which are mapped to corresponding 64-bit element tiles in accordance with the architecturally defined mapping between different element size tiles, e.g. * Zeroing ZA0.B, or the entire array name ZA, is equivalent to zeroing all eight 64-bit element tiles ZA0.D to ZA7.D. * Zeroing ZA0.S is equivalent to zeroing ZA0.D and ZA4.D. The preferred disassembly of this instruction uses the shortest list of tile names that represent the encoded immediate mask, e.g. * An immediate which encodes 64-bit element tiles ZA0.D, ZA1.D, ZA4.D and ZA5.D is disassembled as {ZA0.S, ZA1.S}. * An immediate which encodes 64-bit element tiles ZA0.D, ZA2.D, ZA4.D and ZA6.D is disassembled as {ZA0.H}. * An all-ones immediate is disassembled as {ZA}. * An all-zeros immediate is disassembled as an empty list {}. This patch adds the MatrixTileList asm operand and related parsing to support this. Depends on D105570. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D105575	2021-07-27 08:35:45 +00:00
Lei Huang	64a15817a0	[PowerPC]Add addex instruction definition and MC tests Add td definitions and asm/disasm tests for the addex instruction introduced in ISA 3.0. Reviewed By: nemanjai, amyk, NeHuang Differential Revision: https://reviews.llvm.org/D106666	2021-07-26 14:55:38 -05:00
Michael Liao	b0402a35fc	[amdgpu] Add 64-bit PC support when expanding unconditional branches. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D106445	2021-07-26 14:50:30 -04:00
Ulrich Weigand	8cd8120a7b	[SystemZ] Add support for new cpu architecture - arch14 This patch adds support for the next-generation arch14 CPU architecture to the SystemZ backend. This includes: - Basic support for the new processor and its features. - Detection of arch14 as host processor. - Assembler/disassembler support for new instructions. - New LLVM intrinsics for certain new instructions. - Support for low-level builtins mapped to new LLVM intrinsics. - New high-level intrinsics in vecintrin.h. - Indicate support by defining __VEC__ == 10304. Note: No currently available Z system supports the arch14 architecture. Once new systems become available, the official system name will be added as supported -march name.	2021-07-26 16:57:28 +02:00
Cullen Rhodes	e6ff9179ce	[AArch64][AsmParser] NFC: Parser.getTok().getLoc() -> getLoc() Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D106635	2021-07-26 09:36:34 +00:00
Thomas Johnson	51d8e67e88	[ARC] Add tablegen definition for the Find Leading Set (FLS) instruction Differential Revision: https://reviews.llvm.org/D106602	2021-07-22 17:42:25 -07:00
Thomas Johnson	1cda1e6186	[ARC] Add disassembly for the conditioned RSUB immediate instruction Differential Revision: https://reviews.llvm.org/D106497	2021-07-22 11:34:39 -07:00
Cullen Rhodes	00e87e1c5b	[AArch64][SME] Improve diagnostic for vector select register Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D106540	2021-07-22 13:46:40 +00:00
Simon Tatham	bd41136746	[clang] Use i64 for the !srcloc metadata on asm IR nodes. This is part of a patch series working towards the ability to make SourceLocation into a 64-bit type to handle larger translation units. !srcloc is generated in clang codegen, and pulled back out by llvm functions like AsmPrinter::emitInlineAsm that need to report errors in the inline asm. From there it goes to LLVMContext::emitError, is stored in DiagnosticInfoInlineAsm, and ends up back in clang, at BackendConsumer::InlineAsmDiagHandler(), which reconstitutes a true clang::SourceLocation from the integer cookie. Throughout this code path, it's now 64-bit rather than 32, which means that if SourceLocation is expanded to a 64-bit type, this error report won't lose half of the data. The compiler will tolerate both of i32 and i64 !srcloc metadata in input IR without faulting. Test added in llvm/MC. (The semantic accuracy of the metadata is another matter, but I don't know of any situation where that matters: if you're reading an IR file written by a previous run of clang, you don't have the SourceManager that can relate those source locations back to the original source files.) Original version of the patch by Mikhail Maltsev. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D105491	2021-07-22 10:24:52 +01:00
Carl Ritson	6efb3220b4	[AMDGPU] Add VReg_192/VReg_224 support for MIMG instructions Allow MIMG instructions to be selected with 6/7 VGPRs for vaddr. Previously these were rounded up to VReg_256; this saves VGPRs. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D103800	2021-07-22 10:42:15 +09:00
Cullen Rhodes	008c755d76	[AArch64][SME] Support .arch and .arch_extension assembler directives Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D105566	2021-07-21 08:40:27 +00:00
Cullen Rhodes	2d80bbd939	[AArch64][SME] Add mova instructions This patch adds the mova instruction to insert/extract an SVE vector register to/from a ZA tile vector. The preferred MOV aliases are also implemented. Depends on D105572. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: david-arm, CarolineConcatto Differential Revision: https://reviews.llvm.org/D105574	2021-07-21 08:20:01 +00:00
Cullen Rhodes	6c32cfe85c	[AArch64][SME] Add ldr and str instructions The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: kmclaughlin Differential Revision: https://reviews.llvm.org/D105573	2021-07-21 08:17:13 +00:00
Craig Topper	81efb82570	[RISCV] Teach RISCVMatInt about cases where it can use LUI+SLLI to replace LUI+ADDI+SLLI for large constants. If we need to shift left anyway we might be able to take advantage of LUI implicitly shifting its immediate left by 12 to cover part of the shift. This allows us to use more bits of the LUI immediate to avoid an ADDI. isDesirableToCommuteWithShift now considers compressed instruction opportunities when deciding if commuting should be allowed. I believe this is the same or similar to one of the optimizations from D79492. Reviewed By: luismarques, arcbbb Differential Revision: https://reviews.llvm.org/D105417	2021-07-20 09:22:06 -07:00
Cullen Rhodes	15af3aaa2e	[AArch64][SME] Add system registers and related instructions This patch adds the new system registers introduced in SME: - ID_AA64SMFR0_EL1 (ro) SME feature identifier. - SMCR_ELx (r/w) streaming mode control register for configuring effective SVE Streaming SVE Vector length when the PE is in Streaming SVE mode. - SVCR (r/w) streaming vector control register, visible at all exception levels. Provides access to PSTATE.SM and PSTATE.ZA using MSR and MRS instructions. - SMPRI_EL1 (r/w) streaming mode execution priority register. - SMPRIMAP_EL2 (r/w) streaming mode priority mapping register. - SMIDR_EL1 (ro) streaming mode identification register. - TPIDR2_EL0 (r/w) for use by SME software to manage per-thread SME context. - MPAMSM_EL1 (r/w) MPAM (v8.4) streaming mode register, for labelling memory accesses performed in streaming mode. Also added in this patch are the SME mode change instructions. Three MSR immediate instructions are implemented to set or clear PSTATE.SM, PSTATE.ZA, or both respectively: - MSR SVCRSM, #<imm1> - MSR SVCRZA, #<imm1> - MSR SVCRSMZA, #<imm1> The following smstart/smstop aliases are also implemented for convenience: smstart -> MSR SVCRSMZA, #1 smstart sm -> MSR SVCRSM, #1 smstart za -> MSR SVCRZA, #1 smstop -> MSR SVCRSMZA, #0 smstop sm -> MSR SVCRSM, #0 smstop za -> MSR SVCRZA, #0 The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D105576	2021-07-20 08:06:26 +00:00
Derek Schuff	ad1f5457d2	[WebAssembly] Generate R_WASM_FUNCTION_OFFSET relocs in debuginfo sections Debug info sections need R_WASM_FUNCTION_OFFSET_I32 relocs (with FK_Data_4 fixup kinds) to refer to functions (instead of R_WASM_TABLE_INDEX as is used in data sections). Usually this is done in a convoluted way, with unnamed temp data symbols which target the start of the function, in which case WasmObjectWriter::recordRelocation converts it to use the section symbol instead. However in some cases the function can actually be undefined; in this case the dwarf generator uses the function symbol (a named undefined function symbol) instead. In that case the section-symbol transform doesn't work and we need to generate the correct reloc type a different way. In this change WebAssemblyWasmObjectWriter::getRelocType takes the fixup section type into account to choose the correct reloc type. Fixes PR50408 Differential Revision: https://reviews.llvm.org/D103557	2021-07-19 14:02:33 -07:00
Wouter van Oortmerssen	670944fb20	[WebAssembly] Support R_WASM_MEMORY_ADDR_TLS_SLEB64 for wasm64 Also fixed TLS tests swapping addr & value in store op Differential Revision: https://reviews.llvm.org/D106096	2021-07-19 10:22:43 -07:00
Cullen Rhodes	f91eaa7007	[AArch64][SME] Add SVE2 instructions added in SME This patch adds support for the following instructions: SCLAMP, UCLAMP, REV, DUP (predicate) The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: kmclaughlin Differential Revision: https://reviews.llvm.org/D105577	2021-07-19 08:03:05 +00:00
Craig Topper	4dbb788068	[RISCV] Teach constant materialization that it can use zext.w at the end with Zba to reduce number of instructions. If the upper 32 bits are zero and bit 31 is set, we might be able to use zext.w to fill in the zeros after using an lui and/or addi. Most of this patch is plumbing the subtarget features into the constant materialization. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D105509	2021-07-16 09:35:56 -07:00
Cullen Rhodes	99eb96f031	[AArch64][SME] Add load and store instructions This patch adds support for following contiguous load and store instructions: * LD1B, LD1H, LD1W, LD1D, LD1Q * ST1B, ST1H, ST1W, ST1D, ST1Q A new register class and operand is added for the 32-bit vector select register W12-W15. The differences in the following tests which have been re-generated are caused by the introduction of this register class: * llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-inline-asm.ll * llvm/test/CodeGen/AArch64/GlobalISel/regbank-inlineasm.mir * llvm/test/CodeGen/AArch64/stp-opt-with-renaming-reserved-regs.mir * llvm/test/CodeGen/AArch64/stp-opt-with-renaming.mir D88663 attempts to resolve the issue with the store pair test differences in the AArch64 load/store optimizer. The GlobalISel differences are caused by changes in the enum values of register classes, tests have been updated with the new values. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: CarolineConcatto Differential Revision: https://reviews.llvm.org/D105572	2021-07-16 10:11:10 +00:00
Harald van Dijk	a8ad917054	[X86] Fix handling of maskmovdqu in X32 The maskmovdqu instruction is an odd one: it has a 32-bit and a 64-bit variant, the former using EDI, the latter RDI, but the use of the register is implicit. In 64-bit mode, a 0x67 prefix can be used to get the version using EDI, but there is no way to express this in assembly in a single instruction, the only way is with an explicit addr32. This change adds support for the instruction. When generating assembly text, that explicit addr32 will be added. When not generating assembly text, it will be kept as a single instruction and will be emitted with that 0x67 prefix. When parsing assembly text, it will be re-parsed as ADDR32 followed by MASKMOVDQU64, which still results in the correct bytes when converted to machine code. The same applies to vmaskmovdqu as well. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103427	2021-07-15 22:56:08 +01:00
Fangrui Song	aa3df8ddcd	[test] Avoid llvm-readelf/llvm-readobj one-dash long options and deprecated aliases (e.g. --file-headers)	2021-07-15 10:26:21 -07:00
Cullen Rhodes	dfa76933c2	[AArch64][SME] Add outer product instructions This patch adds support for the following outer product instructions: * BFMOPA, BFMOPS, FMOPA, FMOPS, SMOPA, SMOPS, SUMOPA, SUMOPS, UMOPA, UMOPS, USMOPA, USMOPS. Depends on D105570. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D105571	2021-07-15 09:51:06 +00:00
Thomas Lively	122b0220fd	[WebAssembly] Remove datalayout strings from llc tests The data layout strings do not have any effect on llc tests and will become misleadingly out of date as we continue to update the canonical data layout, so remove them from the tests. Differential Revision: https://reviews.llvm.org/D105842	2021-07-14 11:17:08 -07:00
Jinsong Ji	fe52296a34	[AIX] Enable dollar sign as PC in inlineasm $ is used as PC for PowerPC inlineasm, ELF use it, enable it for AIX XCOFF as well. Reviewed By: #powerpc, amyk, nemanjai Differential Revision: https://reviews.llvm.org/D105956	2021-07-14 13:37:52 +00:00
Cullen Rhodes	c08dabb0f4	[AArch64][SME] Add matrix register definitions and parsing support SME introduces the ZA array, a new piece of architectural register state consisting of a matrix of [SVLb x SVLb] bytes, where SVL is the implementation defined Streaming SVE vector length and SVLb is the number of 8-bit elements in a vector of SVL bits. SME instructions consist of three types of matrix operands: * Tiles: a ZA tile is a square, two-dimensional sub-array of elements within the ZA array. These tiles make up the larger accumulator array and the granularity varies based on the element size, i.e. - ZAQ0..ZAQ15 (smallest tile granule) - ZAD0..ZAD7 - ZAS0..ZAS3 - ZAH0..ZAH1 or ZAB0 (largest tile granule, single tile) * Tile vectors: similar to regular tiles, but have an extra 'h' or 'v' to tell how the vector at [reg+offset] is laid out in the tile, horizontally or vertically. E.g. za1h.h or za15v.q, which correspond to vectors in registers ZAH1 and ZAQ15, respectively. * Accumulator matrix: this is the entire accumulator array ZA. This patch adds the register classes and related operands and parsing for SME instructions operating on the accumulator array. The ADDHA and ADDVA instructions which operate on tiles are also added in this patch to make some use of the code added, later patches will make use of the other operands introduced here. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Co-authored by: Sander de Smalen (@sdesmalen) Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D105570	2021-07-14 08:25:49 +00:00
Jinsong Ji	64785ac12e	[AIX] Update testcase to use aix triple We have implemented the basic MCAsmParser now, we can use the triple directly now.	2021-07-14 03:32:37 +00:00
Hafiz Abid Qadeer	b205f2bb89	[AMDGPU] Handle s_branch to another section. Currently, if target of s_branch instruction is in another section, it will fail with the error of undefined label. Although in this case, the label is not undefined but present in another section. This patch tries to handle this issue. So while handling fixup_si_sopp_br fixup in getRelocType, if the target label is undefined we issue an error as before. If it is defined, a new relocation type R_AMDGPU_REL16 is returned. This issue has been reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100181 and https://bugs.llvm.org/show_bug.cgi?id=45887. Before https://reviews.llvm.org/D79943, we used to get a crash for this scenario. The crash is fixed now but we still get an undefined label error. Jumps to another section can arise with hot/cold splitting. A patch to handle the relocation in lld will follow shortly. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D105760	2021-07-13 12:17:47 +01:00
Cullen Rhodes	9e42675103	[AArch64] Add target features for Armv9-A Scalable Matrix Extension (SME) First patch in a series adding MC layer support for the Arm Scalable Matrix Extension. This patch adds the following features: sme, sme-i64, sme-f64 The sme-i64 and sme-f64 flags are for the optional I16I64 and F64F64 features. If a target supports I16I64 then the following instructions are implemented: * 64-bit integer ADDHA and ADDVA variants (D105570). * SMOPA, SMOPS, SUMOPA, SUMOPS, UMOPA, UMOPS, USMOPA, and USMOPS instructions that accumulate 16-bit integer outer products into 64-bit integer tiles. If a target supports F64F64 then the FMOPA and FMOPS instructions that accumulate double-precision floating-point outer products into double-precision tiles are implemented. Outer products are implemented in D105571. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-06 Reviewed By: CarolineConcatto Differential Revision: https://reviews.llvm.org/D105569	2021-07-12 13:28:10 +00:00
Wouter van Oortmerssen	9647a6f719	[WebAssembly] Added initial type checker to MC Assembler This is to protect against nonsensical instruction sequences being assembled, which would either cause asserts/crashes further down, or a Wasm module being output that doesn't validate. Unlike a validator, this type checker is able to give type-errors as part of the parsing process, which makes the assembler much friendlier to be used by humans writing manual input. Because the MC system is single pass (instructions aren't even stored in MC format, they are directly output) the type checker has to be single pass as well, which means that from now on .globaltype and .functype decls must come before their use. An extra pass is added to Codegen to collect information for this purpose, since AsmPrinter is normally single pass / streaming as well, and would otherwise generate this information on the fly. A `-no-type-check` flag was added to llvm-mc (and any other tools that take asm input) that suppresses type errors, as a quick escape hatch for tests that were not intended to be type correct. This is a first version of the type checker that ignores control flow, i.e. it checks that types are correct along the linear path, but not the branch path. This will still catch most errors. Branch checking could be added in the future. Differential Revision: https://reviews.llvm.org/D104945	2021-07-09 14:07:25 -07:00
Anirudh Prasad	7bc1baea6e	[MCParser][z/OS] Mark a few tests as unsupported for the z/OS Target - Background here is that these sets of tests are "invalid" to be run on z/OS - The reason is that these tests use constructs that HLASM never supports (HLASM doesn't support GNU style directives) - Usually tests are geared towards a particular target via the use of a triple that targets just that platform, but these tests require the use of a "default triple" - Thus, we mark these tests as "UNSUPPORTED" for z/OS since we don't want to run these for z/OS Reviewed By: yusra.syeda, abhina.sreeskantharajan Differential Revision: https://reviews.llvm.org/D105204	2021-07-05 11:06:52 -04:00
Jinsong Ji	bf64210fd8	[AIX] Add dummy XCOFF MCAsmParserExtension Implement XCOFFMCAsmParser so that we can use MC to parse inline asm. The directives and storage mapping classes will be added later iteratively. Reviewed By: xgupta Differential Revision: https://reviews.llvm.org/D105259	2021-07-02 16:12:21 +00:00
Igor Kudrin	657e067bb5	[ARMInstPrinter] Print the target address of a branch instruction This follows other patches that changed printing immediate values of branch instructions to target addresses, see D76580 (x86), D76591 (PPC), D77853 (AArch64). As observing immediate values might sometimes be useful, they are printed as comments for branch instructions. // llvm-objdump -d output (before) 000200b4 <_start>: 200b4: ff ff ff fa blx #-4 <thumb> 000200b8 <thumb>: 200b8: ff f7 fc ef blx #-8 <_start> // llvm-objdump -d output (after) 000200b4 <_start>: 200b4: ff ff ff fa blx 0x200b8 <thumb> @ imm = #-4 000200b8 <thumb>: 200b8: ff f7 fc ef blx 0x200b4 <_start> @ imm = #-8 // GNU objdump -d. 000200b4 <_start>: 200b4: faffffff blx 200b8 <thumb> 000200b8 <thumb>: 200b8: f7ff effc blx 200b4 <_start> Differential Revision: https://reviews.llvm.org/D104701	2021-06-30 16:35:28 +07:00
Fangrui Song	a9854045f6	[test] Change -t to --syms and -s to -S for llvm-readobj RUN lines -s and -t will be changed to improve consistency with llvm-readelf. The inconsistency issue regularly contributes to confusion using the two tools.	2021-06-29 11:50:31 -07:00
Soham Dixit	51d969dc27	[DebugInfo] Bug 41152 - Improve dumping of empty location expressions Fixes PR41152 (https://bugs.llvm.org/show_bug.cgi?id=41152). Reviewed by: jhenderson, dblaikie, SouraVX Differential Revision: https://reviews.llvm.org/D103502	2021-06-29 09:21:00 +01:00
David Spickett	558d9e8228	[llvm][ARM] Treat xscale arch as an alias of armv5te Previously xscale was known to everything apart from the ELF streamer so we would crash as soon as you tried to output an object file. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D104776	2021-06-28 15:20:24 +00:00
Lucas Prates	88b1135e72	[Aarch64] Adding support for Armv9-A Realm Management Extension This adds support for Armv9-A's Realm Management Extension, including three new system registers - MFAR_EL3, GPCCR_EL3 and GPTBR_EL3 - and four new TLBI instructions. The reference for the Realm Management Extension can be found at: https://developer.arm.com/documentation/ddi0615/aa. Based on patches by Victor Campos. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D104773	2021-06-28 13:45:22 +01:00
Igor Kudrin	e7fffa6f03	[llvm-objdump] Prefix memory operand addresses with '0x' This helps to avoid ambiguity when the address contains only digits 0..9. Differential Revision: https://reviews.llvm.org/D104909	2021-06-28 14:25:21 +07:00
Ulrich Weigand	b2674670f2	[SystemZ] Add support for .reloc assembler directive Add support for the .reloc directive along the lines of other back-ends. This fixes a regression after https://reviews.llvm.org/D104080 was merged, since that patch presupposed support for .reloc.	2021-06-25 21:51:10 +02:00
Fangrui Song	ca3bdb57fa	[MC][ELF] Change SHT_LLVM_CALL_GRAPH_PROFILE relocations from SHT_RELA to SHT_REL ... even on targets preferring RELA. The section is only consumed by ld.lld which can handle REL. Follow-up to D104080 as I explained in the review. There are two advantages: * The D104080 code only handles RELA, so arm/i386/mips32 etc may warn for -fprofile-use=/-fprofile-sample-use= usage. * Decrease object file size for RELA targets While here, change the relocation to relocate weights, instead of 0,1,2,3,.. I failed to catch the issue during review.	2021-06-24 21:35:48 -07:00
Aakanksha Patil	3453f3dd46	[AMDGPU] Add gfx1035 target Differential Revision: https://reviews.llvm.org/D104804	2021-06-24 14:32:41 -04:00
Alexander Yermolovich	a224c5199b	[LLD][LLVM] CG Graph profile using relocations Currently when .llvm.call-graph-profile is created by llvm it explicitly encodes the symbol indices. This section is basically a black box for post processing tools. For example, if we run strip -s on the object files the symbol table changes, but indices in that section do not. In non-visible behavior indices point to wrong symbols. The visible behavior indices point outside of Symbol table: "invalid symbol index". This patch changes the format by using R_*_NONE relocations to indicate the from/to symbols. The Frequency (Weight) will still be in the .llvm.call-graph-profile, but symbol information will be in relocation section. In LLD information from both sections is used to reconstruct call graph profile. Relocations themselves will never be applied. With this approach post processing tools that handle relocations correctly work for this section also. Tools can add/remove symbols and as long as they handle relocation sections with this approach information stays correct. Doing a quick experiment with clang-13. The size went up from 107KB to 322KB, aggregate of all the input sections. Size of clang-13 binary is ~118MB. For users of -fprofile-use/-fprofile-sample-use the size of object files will go up slightly, it will not impact final binary size. Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D104080	2021-06-24 09:09:33 -07:00
Fangrui Song	c618692218	[AArch64][X86] Allow 64-bit label differences lower to IMAGE_REL_*_REL32 `IMAGE_REL_ARM64_REL64/IMAGE_REL_AMD64_REL64` do not exist and `.quad a - .` is currently not representable. For instrumentation, `.quad a - .` is useful representing a cross-section reference in a metadata section, to allow ELF medium/large code models. The COFF limitation makes such generic instrumentations inconvenient. I plan to make a PGO/coverage metadata section field relative in D104556. Differential Revision: https://reviews.llvm.org/D104564	2021-06-21 14:32:25 -07:00
Saleem Abdulrasool	b30bc8cc5d	RISCV: simplify a test case for RISCV (NFCI) The output of the object file is unimportant and entirely discarded. Simply redirect the output to `/dev/null` or `NUL` as the case may be. Additionally, the space between the labels is unimportant. There is no need to add space between the labels. Two labels at the same address are sufficient to generate the difference expression and should still test the same behaviour.	2021-06-18 08:19:16 -07:00
Heejin Ahn	1d891d44f3	[WebAssembly] Rename event to tag We recently decided to change 'event' to 'tag', and 'event section' to 'tag section', out of the rationale that the section contains a generalized tag that references a type, which may be used for something other than exceptions, and the name 'event' can be confusing in the web context. See - https://github.com/WebAssembly/exception-handling/issues/159#issuecomment-857910130 - https://github.com/WebAssembly/exception-handling/pull/161 Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D104423	2021-06-17 20:34:19 -07:00
Saleem Abdulrasool	116841c623	RISCV: clean up target expression handling The target specific expression handling was slightly regressed by `bbea64250f`. This restores the proper sub-expression evaluation to allow for constant folding within the expression. We explicitly discard the layout and assembler when evaluating the expression to avoid any symbolic computation and instead using the `evaluateAsRelocatable` to canonicalise and constant fold only. We can also simplify the expression handling - none of the target variants support symbolic difference. This simplifies the logic for that and adds additional tests to ensure that we do not accidentally regress here in the future. Reviewed By: maskray Differential Revision: https://reviews.llvm.org/D104473	2021-06-17 13:35:32 -07:00
Saleem Abdulrasool	bbea64250f	RISCV: adjust handling of relocation emission for RISCV This re-architects the RISCV relocation handling to bring the implementation closer in line with the implementation in binutils. We would previously aggressively resolve the relocation. With this restructuring, we always will emit a paired relocation for any symbolic difference of the type of S±T[±C] where S and T are labels and C is a constant. GAS has a special target hook controlled by `RELOC_EXPANSION_POSSIBLE` which indicates that a fixup may be expanded into multiple relocations. This is used by the RISCV backend to always emit a paired relocation - either ADD[WIDTH] + SUB[WIDTH] for text relocations or SET[WIDTH] + SUB[WIDTH] for a debug info relocation. Irrespective of whether linker relaxation support is enabled, symbolic difference is always emitted as a paired relocation. This change also sinks the target specific behaviour down into the target specific area rather than exposing it to the shared relocation handling. In the process, we also sink the "special" handling for debug information down into the RISCV target. Although this improves the path for the other targets, this is not necessarily entirely ideal either. The changes in the debug info emission could be done through another type of hook as this functionality would be required by any other target which wishes to do linker relaxation. However, as there are no other targets in LLVM which currently do this, this is a reasonable thing to do until such time as the code needs to be shared. Improve the handling of the relocation (and add a reduced test case from the Linux kernel) to ensure that we handle complex expressions for symbolic difference. This ensures that we correctly relocate symbols with the addends normalized and associated with the addition portion of the paired relocation. 
This change also addresses some review comments from Alex Bradbury about the relocations meant for use in the DWARF CFA being named incorrectly (using ADD6 instead of SET6) in the original change which introduced the relocation type. This resolves the issues with the symbolic difference emission sufficiently to enable building the Linux kernel with clang+IAS+lld (without linker relaxation). Resolves PR50153, PR50156! Fixes: ClangBuiltLinux/linux#1023, ClangBuiltLinux/linux#1143 Reviewed By: nickdesaulniers, maskray Differential Revision: https://reviews.llvm.org/D103539	2021-06-17 08:20:02 -07:00
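
Note on commit 2e27c4e1f1 ([AArch64][SME] Add zero instruction): the message says the preferred disassembly uses the shortest list of tile names that represents the encoded immediate mask. The following is a minimal sketch of that idea, not LLVM's implementation. It assumes that bit i of the 8-bit mask selects ZAi.D, and that the wider tile names cover 64-bit tiles as quoted in the message (ZA0.S covers ZA0.D and ZA4.D; ZA0.H covers ZA0.D, ZA2.D, ZA4.D and ZA6.D). The names `TILES` and `tile_list` are hypothetical, and the ordering of names in mixed-size lists is a choice of this sketch only.

```python
# Hypothetical helper, not LLVM code: turn an 8-bit ZERO immediate mask
# (assumption: bit i set means ZAi.D is zeroed) into a short tile list.

# Candidate tile names, coarsest first, paired with the 64-bit tiles they cover.
TILES = [("za", 0xFF)]
# ZAn.H covers every second 64-bit tile starting at n.
TILES += [(f"za{n}.h", sum(1 << d for d in range(n, 8, 2))) for n in range(2)]
# ZAn.S covers every fourth 64-bit tile starting at n.
TILES += [(f"za{n}.s", sum(1 << d for d in range(n, 8, 4))) for n in range(4)]
# ZAn.D covers exactly one 64-bit tile.
TILES += [(f"za{n}.d", 1 << n) for n in range(8)]


def tile_list(mask: int) -> str:
    """Greedily pick the coarsest tiles that fit entirely inside the mask."""
    if not 0 <= mask <= 0xFF:
        raise ValueError("mask must fit in 8 bits")
    names = []
    remaining = mask
    for name, covered in TILES:
        # Take a tile only if every 64-bit tile it covers is still pending.
        if covered & remaining == covered:
            names.append(name)
            remaining &= ~covered
    return "{" + ",".join(names) + "}"


if __name__ == "__main__":
    for mask in (0b00110011, 0b01010101, 0xFF, 0x00):
        print(f"{mask:#010b} -> {tile_list(mask)}")
```

Because the tile names form a nested hierarchy (each coarser tile is a union of finer ones), taking the coarsest fitting tile first yields a minimal-length list for any mask under these assumptions.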


8315 Commits