Aligning section contents is not required, but only
recommended, by the specification. Microsoft's documentation says
(https://docs.microsoft.com/en-us/windows/desktop/debug/pe-format#section-table-section-headers):
"For object files, the value should be aligned on a 4-byte boundary
for best performance."
However, according to my measurements, aligning section contents has
a neutral to negative effect on performance.
I measured the median run time of 100 links of Chromium's
base_unittests on Linux with lld-link and on Windows with link.exe with
both aligned and unaligned sections. On Linux I didn't see a measurable
performance difference, and on Windows the link was slightly faster
with unaligned sections (presumably because on Windows the bottleneck
is I/O).
Also, the sections created by cl.exe are unaligned, so we should expect
tools to broadly accept unaligned sections.
Differential Revision: https://reviews.llvm.org/D51149
llvm-svn: 340514
The format is the same as in ELF: a sequence of ULEB128-encoded
symbol indexes.
Differential Revision: https://reviews.llvm.org/D51047
llvm-svn: 340499
This adds the plumbing for the Tiny code model for the AArch64 backend. This,
instead of loading addresses through the normal ADRP;ADD pair used in the Small
model, uses a single ADR. The 21 bit range of an ADR means that the code and
its statically defined symbols need to be within 1MB of each other.
This makes it mostly interesting for embedded applications where we want to fit
as much as we can in as small a space as possible.
Differential Revision: https://reviews.llvm.org/D49673
llvm-svn: 340397
Summary:
This CL implements v128.const for each vector type. New operand types
are added to ensure the vector contents can be serialized without LEB
encoding. Tests are added for instruction selection, encoding,
assembly and disassembly.
Reviewers: aheejin, dschuff, aardappel
Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D50873
llvm-svn: 340336
Summary:
This commit adds new intrinsics
llvm.amdgcn.raw.tbuffer.load
llvm.amdgcn.struct.tbuffer.load
llvm.amdgcn.raw.tbuffer.store
llvm.amdgcn.struct.tbuffer.store
with the following changes from the llvm.amdgcn.tbuffer.* intrinsics:
* there are separate raw and struct versions: raw does not have an index
arg and sets idxen=0 in the instruction, and struct always sets
idxen=1 in the instruction even if the index is 0, to allow for the
fact that gfx9 does bounds checking differently depending on whether
idxen is set;
* there is a combined format arg (dfmt+nfmt)
* there is a combined cachepolicy arg (glc+slc)
* there are now only two offset args: one for the offset that is
included in bounds checking and swizzling, to be split between the
instruction's voffset and immoffset fields, and one for the offset
that is excluded from bounds checking and swizzling, to go into the
instruction's soffset field.
The AMDISD::TBUFFER_* SD nodes always have an index operand, all three
offset operands, combined format operand, combined cachepolicy operand,
and an extra idxen operand.
The tbuffer pseudo- and real instructions now also have a combined
format operand.
The obsolescent llvm.amdgcn.tbuffer.* and llvm.SI.tbuffer.store
intrinsics continue to work.
V2: Separate raw and struct intrinsics.
V3: Moved extract_glc and extract_slc defs to a more sensible place.
V4: Rebased on D49995.
V5: Only two separate offset args instead of three.
V6: Pseudo- and real instructions have joint format operand.
V7: Restored optionality of dfmt and nfmt in assembler.
V8: Addressed minor review comments.
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D49026
Change-Id: If22ad77e349fac3a5d2f72dda53c010377d470d4
llvm-svn: 340268
This patch adds system registers for controlling aspects of SVE:
- ZCR_EL1 (r/w) visible at EL1 and EL0.
- ZCR_EL2 (r/w) visible at EL2 and Non-secure EL1 and EL0.
- ZCR_EL3 (r/w) visible at all exception levels.
and a system register identifying SVE:
- ID_AA64ZFR0_EL1 (r) SVE Feature identifier.
Reviewers: SjoerdMeijer, samparker, pbarrio, fhahn, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D50885
llvm-svn: 340158
Add +fp16fml feature for new FP16 instructions, which are a
mandatory part of FP16 from v8.4-A and an optional part of FP16
from v8.2-A. It doesn't seem to be possible to model this in
LLVM, but the relationship between the options is handled by
the related clang patch.
In keeping with what I think is the usual practice, the fp16fml
extension is accepted regardless of base architecture version.
Builds on/replaces Sjoerd Meijer's patch to add these instructions at
https://reviews.llvm.org/D49839.
Differential Revision: https://reviews.llvm.org/D50228
llvm-svn: 340013
Handle the case when the symbol is private. Private symbols are not in
the COFF object file symbol table, so they aren't inserted into
SymbolMap. We can't look up the section of the symbol that way. Instead,
get the MCSection from the MCSymbol and map that to the object file
section.
Print a better error message when the symbol has no section, like when
the symbol is undefined.
Fixes PR38607
llvm-svn: 339942
Summary:
This prefix was added in r333421, and it changed our dumper output to
say things like "CVRegEAX" instead of just "EAX". That's a functional
change that I'd rather avoid.
I tested GCC, Clang, and MSVC, and all of them support #pragma
push_macro. They don't issue warnings whem the macro is not defined
either.
I don't have a Mac so I can't test the real termios.h header, but I
looked at the termios.h sources online and looked for other conflicts.
I saw only the CR* macros, so those are the ones we work around.
Reviewers: zturner, JDevlieghere
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D50851
llvm-svn: 339907
Allow the comparison of x86 registers in the evaluation of assembler
directives. This generalizes and simplifies the extension from r334022
to catch another case found in the Linux kernel.
Reviewers: rnk, void
Reviewed By: rnk
Subscribers: hiraditya, nickdesaulniers, llvm-commits
Differential Revision: https://reviews.llvm.org/D50795
llvm-svn: 339895
The behavior in 64-bit mode is different between Intel and AMD CPUs. Intel ignores the 0x66 prefix. AMD does not. objump doesn't ignore the 0x66 prefix. Since LLVM aims to match objdump behavior, we should do the same.
While I was trying to fix this I had change brtarget16/32 to use ENCODING_IW/ID instead of ENCODING_Iv to get the 0x66+REX.W case to act sort of sanely. It's still wrong, but that's a problem for another day.
The change in encoding exposed the fact that 16-bit mode disassembly of relative jumps was creating JMP_4 with a 2 byte immediate. It should have been JMP_2. From just printing you can't tell the difference, but if you dumped the encoding it wouldn't have matched what we started with.
While fixing that, it exposed that jo/jno opcodes were missing from the switch that this patch deleted and there were no test cases for them.
Fixes PR38537.
llvm-svn: 339622
Summary:
Moved Explicit Locals pass to last.
Made that pass obligatory.
Made it convert from register to stack based instructions, and removed the registers.
Fixes to related code that was expecting register based instructions.
Added the correct testing flag to all tests, depending on what the
format they were expecting so far.
Translated one test to stack format as example: reg-stackify-stack.ll
tested:
llvm-lit -v `find test -name WebAssembly`
unittests/MC/*
Reviewers: dschuff, sunfish
Subscribers: jfb, llvm-commits, aheejin, eraman, jgravelle-google, sbc100
Differential Revision: https://reviews.llvm.org/D50568
llvm-svn: 339474
This pseudo-instruction is similar to la but uses PC-relative addressing
unconditionally. This is, la is only different to lla when using -fPIC. This
pseudo-instruction seems often forgotten in several specs but it is definitely
mentioned in binutils opcodes/riscv-opc.c. The semantics are defined both in
page 37 of the "RISC-V Reader" book but also in function macro found in
gas/config/tc-riscv.c.
This is a very first step towards adding PIC support for Linux in the RISC-V
backend.
The lla pseudo-instruction expands to a sequence of auipc + addi with a couple
of pc-rel relocations where the second points to the first one. This is
described in
https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md#pc-relative-symbol-addresses
For now, this patch only introduces support of that pseudo instruction at the
assembler parser.
Differential Revision: https://reviews.llvm.org/D49661
llvm-svn: 339314
On Darwin we pin the DWARF line tables to version 2. Stop doing so for
DWARF v5 and later.
Differential revision: https://reviews.llvm.org/D49381
llvm-svn: 339288
Match the GNU assembler in supporting immediate operands for these
instructions even when the reg-reg mnemonic is used.
Differential Revision: https://reviews.llvm.org/D50046
Patch by Kito Cheng.
llvm-svn: 339252
This matches our behaviour for regular (i.e. relocated) references to
private symbols and therefore avoids needing to unnecessarily write
address-significant .L symbols to the object file's symbol table,
which can interfere with stack traces.
Fixes check-cfi after r339050.
llvm-svn: 339066
ld64 supplies its own Thumb bit for Thumb functions, and intentionally zeroes
out that part of any addend in an object file. But it only does that for
symbols marked N_EXT -- i.e. external symbols. So LLVM should avoid setting
that extra bit in other cases.
llvm-svn: 339007
There are a bunch of edge cases and inconsistencies in how we're emitting sections
cause this warning to fire and it needs more work.
This reverts commit r335558.
llvm-svn: 338968
As a part of adding the tiny codemodel, we need to support ldr's with :got:
relocations on them. This seems to be mostly already done, just needs the
relocation type support.
Differential Revision: https://reviews.llvm.org/D50137
llvm-svn: 338673
Contrary to ELF, we don't add any markers that distinguish data generated
with .short/.long from normal instructions, so the .inst directive only
adds compatibility with assembly that uses it.
Differential Revision: https://reviews.llvm.org/D49936
llvm-svn: 338356
Contrary to ELF, we don't add any markers that distinguish data generated
with .long from normal instructions, so the .inst directive only adds
compatibility with assembly that uses it.
Differential Revision: https://reviews.llvm.org/D49935
llvm-svn: 338355
This patch enables instructions that are destructive on their
destination- and first source operand, to be prefixed with a
MOVPRFX instruction.
This patch also adds a variety of tests:
- positive tests for all instructions and forms that accept a
movprfx for either or both predicated and unpredicated forms.
- negative tests for all instructions and forms that do not accept
an unpredicated or predicated movprfx.
- negative tests for the diagnostics that get emitted when a MOVPRFX
instruction is used incorrectly.
This is patch [2/2] in a series to add MOVPRFX instructions:
- Patch [1/2]: https://reviews.llvm.org/D49592
- Patch [2/2]: https://reviews.llvm.org/D49593
Reviewers: rengolin, SjoerdMeijer, samparker, fhahn, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D49593
llvm-svn: 338261
The WHILE instructions generate a predicate that is true while the
comparison of the first scalar operand (incremented for each predicate
element) with the second scalar operand is true and false thereafter.
WHILELE While incrementing signed scalar less than or equal to scalar
WHILELO While incrementing unsigned scalar lower than scalar
WHILELS While incrementing unsigned scalar lower than or same as scalar
WHILELT While incrementing signed scalar less than scalar
e.g.
whilele p0.s, x0, x1
generates predicate p0 (for 32bit elements) by incrementing
(signed) x0 and comparing that vector to splat(x1).
llvm-svn: 338211
The instructions added in this patch permit active elements within
a vector to be processed sequentially without unpacking the vector.
PFIRST Set the first active element to true.
PNEXT Find next active element in predicate.
CTERMEQ Compare and terminate loop when equal.
CTERMNE Compare and terminate loop when not equal.
llvm-svn: 338210
This patch adds PFALSE (unconditionally sets all elements of
the predicate to false) and PTEST (set the status flags for the
predicate).
llvm-svn: 338198
This patch adds support for instructions that partition a predicate
based on data-dependent termination conditions in a loop.
BRKA Break after the first true condition
BRKAS Break after the first true condition, setting condition flags
BRKB Break before the first true condition
BRKBS Break before the first true condition, setting condition flags
BRKPA Break after the first true condition, propagating from the
previous partition
BRKPAS Break after the first true condition, propagating from the
previous partition, setting condition flags
BRKPB Break before the first true condition, propagating from the
previous partition
BRKPBS Break before the first true condition, propagating from the
previous partition, setting condition flags
BRKN Propagate break to next partition
BKRNS Propagate break to next partition, setting condition flags
llvm-svn: 338196
Summary:
Moved Explicit Locals pass to last.
Made that pass obligatory.
Made it convert from register to stack based instructions, and removed the registers.
Fixes to related code that was expecting register based instructions.
Added the correct testing flag to all tests, depending on what the
format they were expecting so far.
Translated one test to stack format as example: reg-stackify-stack.ll
tested:
llvm-lit -v `find test -name WebAssembly`
unittests/MC/*
Reviewers: dschuff, sunfish
Subscribers: sbc100, jgravelle-google, eraman, aheejin, llvm-commits
Differential Revision: https://reviews.llvm.org/D49160
llvm-svn: 338164
This patch adds support for various integer reduction operations:
SADDV signed add reduction to scalar
UADDV unsigned add reduction to scalar
SMAXV signed maximum reduction to scalar
SMINV signed minimum reduction to scalar
UMAXV unsigned maximum reduction to scalar
UMINV unsigned minimum reduction to scalar
ANDV logical AND reduction to scalar
ORV logical OR reduction to scalar
EORV logical EOR reduction to scalar
The reduction is predicated, e.g.
smaxv s0, p0, z1.s
performs a signed maximum reduction on active elements in z1,
and stores the (signed max value) result in s0.
llvm-svn: 338126
This patch adds support for various floating-point
reduction operations:
FADDA strictly-ordered add reduction, accumulating in scalar
FADDV recursive add reduction to scalar
FMAXV recursive max reduction to scalar
FMINV recursive min reduction to scalar
FMAXNMV recursive max number reduction to scalar
FMINNMV recursive min number reduction to scalar
The reduction is predicated, e.g.
fadda d0, p0, d0, z1.d
performs the add-reduction in strict order on active elements
in z1, accumulating into d0.
faddv d0, p0, z1.d
performs the add-reduction (not in strict order)
on active elements in z1, storing the result in d0.
llvm-svn: 338123
This patch adds support for transcendental acceleration
instructions 'FEXPA' (exponential accelerator) and 'FTSSEL'
(trigonometric select coefficient).
llvm-svn: 338121
Even though gas doesn't document it, it has been supported there for
a very long time.
This produces the 32 bit relative virtual address (aka image relative
address) for a given symbol. ".rva foo" is essentially equal to
".long foo@imgrel".
Differential Revision: https://reviews.llvm.org/D49821
llvm-svn: 338063
- Some of the v8.3 pointer authentication instruction inhabit the Hint space
- These instructions can be assembled to hint instructions which act as NOP instructions prior to v8.3
- This patch permits using the hint instructions for all v8a targets
- Also, correct the RETA{A,B} instructions to match the instruction attributes of RET (set isTerminator and isBarrier)
Differential Revision: https://reviews.llvm.org/D49786
llvm-svn: 338029
This adds MC support for the crypto instructions that were made optional
extensions in Armv8.2-A (AArch64 only).
Differential Revision: https://reviews.llvm.org/D49370
llvm-svn: 338010
--strip-trailing-cr is a diffutils option which is also available on
BSD-licensed diff introduced in FreeBSD 11.2, however, it has a bug
comparing files mixing \r and \r\n. Use -b (POSIX) instead.
llvm-svn: 338008
The target independent AsmParser doesn't recognise .hword, .word, .dword
which are required for Mips. Currently MipsAsmParser recognises these
through dispatch to MipsAsmParser::parseDataDirective. This contains
equivalent logic to AsmParser::parseDirectiveValue. This patch allows
reuse of AsmParser::parseDirectiveValue by making use of
addAliasForDirective to support .hword, .word and .dword.
Original patch provided by Alex Bradbury at D47001 was modified to fix
handling of microMIPS symbols. The `AsmParser::parseDirectiveValue`
calls either `EmitIntValue` or `EmitValue`. In this patch we override
`EmitIntValue` in the `MipsELFStreamer` to clear a pending set of
microMIPS symbols.
Differential revision: https://reviews.llvm.org/D49539
llvm-svn: 337893
This patch adds the following instructions:
RBIT reverse bits within each active elemnt (predicated), e.g.
rbit z0.d, p0/m, z1.d
for 8, 16, 32 and 64 bit elements.
REV reverse order of elements in data/predicate vector
(unpredicated), e.g.
rev z0.d, z1.d
rev p0.d, p1.d
for 8, 16, 32 and 64 bit elements.
REVB reverse order of bytes within each active element, e.g.
revb z0.d, p0/m, z1.d
for 16, 32 and 64 bit elements.
REVH reverse order of 16-bit half-words within each active
element, e.g.
revh z0.d, p0/m, z1.d
for 32 and 64 bit elements.
REVW reverse order of 32-bit words within each active element,
e.g.
revw z0.d, p0/m, z1.d
for 64 bit elements.
llvm-svn: 337534
This patch adds support for the following unpredicated
floating-point instructions:
FADD Floating point add
FSUB Floating point subtract
FMUL Floating point multiplication
FTSMUL Floating point trigonometric starting value
FRECPS Floating point reciprocal step
FRSQRTS Floating point reciprocal square root step
The instructions have the following assembly format:
fadd z0.h, z1.h, z2.h
and have variants for 16, 32 and 64-bit FP elements.
llvm-svn: 337383
The signed/unsigned DOT instructions perform a dot-product on
quadtuplets from two source vectors and accumulate the result in
the destination register. The instructions come in two forms:
Vector form, e.g.
sdot z0.s, z1.b, z2.b - signed dot product on four 8-bit quad-tuplets,
accumulating results in 32-bit elements.
udot z0.d, z1.h, z2.h - unsigned dot product on four 16-bit quad-tuplets,
accumulating results in 64-bit elements.
Indexed form, e.g.
sdot z0.s, z1.b, z2.b[3] - signed dot product on four 8-bit quad-tuplets
with specified quadtuplet from second
source vector, accumulating results in 32-bit
elements.
udot z0.d, z1.h, z2.h[1] - dot product on four 16-bit quad-tuplets
with specified quadtuplet from second
source vector, accumulating results in 64-bit
elements.
llvm-svn: 337372
This patch adds the following predicated instructions:
UDIV Unsigned divide active elements
UDIVR Unsigned divide active elements, reverse form.
SDIV Signed divide active elements
SDIVR Signed divide active elements, reverse form.
e.g.
udiv z0.s, p0/m, z0.s, z1.s
(unsigned divide active elements in z0 by z1, store result in z0)
sdivr z0.s, p0/m, z0.s, z1.s
(signed divide active elements in z1 by z0, store result in z0)
llvm-svn: 337369
This patch adds the following instructions:
MUL - multiply vectors, e.g.
mul z0.h, p0/m, z0.h, z1.h
- multiply with immediate, e.g.
mul z0.h, z0.h, #127
SMULH - signed multiply returning high half, e.g.
smulh z0.h, p0/m, z0.h, z1.h
UMULH - unsigned multiply returning high half, e.g.
umulh z0.h, p0/m, z0.h, z1.h
llvm-svn: 337358
This is the lead-up to having SPE codegen. Add the rest of the
instructions, along with MC tests.
Differential Revision: https://reviews.llvm.org/D44829
llvm-svn: 337346
This patch completes support for the following floating point
instructions that take FP immediates:
FADD* (addition)
FSUB (subtract)
FSUBR (subtract reverse form)
FMUL* (multiplication)
FMAX* (maximum)
FMAXNM (maximum number)
FMIN (maximum)
FMINNM (maximum number)
All operations are predicated and take a FP immediate operand,
e.g.
fadd z0.h, p0/m, z0.h, #0.5
fmin z0.s, p0/m, z0.s, #1.0
^___________^ (tied)
* Instructions added in a previous patch.
llvm-svn: 337272
The SPLICE instruction splices two vectors into one vector using a
predicate. It copies the active elements from the first vector, and
then fills the remaining elements with the low-numbered elements from
the second vector.
The instruction has the following form, e.g.
splice z0.b, p0, z0.b, z1.b
for 8-bit elements. It also supports 16, 32 and
64-bit elements.
llvm-svn: 337253
This patch adds an instruction that allows extracting
a vector from a pair of vectors, given an immediate index
that describes the element position to extract from.
The instruction has the following assembly:
ext z0.b, z0.b, z1.b, #imm
where #imm is an immediate between 0 and 255.
llvm-svn: 337251
This support was partial and temporary. Now that we have
wasm object file support its no longer needed.
Differential Revision: https://reviews.llvm.org/D48744
llvm-svn: 337222
This patch adds support for the following unpack instructions:
- PUNPKLO, PUNPKHI Unpack elements from low/high half and
place into elements of twice their size.
e.g. punpklo p0.h, p0.b
- UUNPKLO, UUNPKHI Unpack elements from low/high half and
SUNPKLO, SUNPKHI place into elements of twice their size
after zero- or sign-extending the values.
e.g. uunpklo z0.h, z0.b
llvm-svn: 336982
AT_NAME was being emitted before the directory paths were remapped. This
ensures that all paths are remapped before anything is emitted.
An additional test case has been added.
Note that this only works if the replacement string is an absolute path.
If not, then AT_decl_file believes the new path is a relative path, and
joins that path with the compilation directory. I do not know of a good
way to resolve this.
Patch by: Siddhartha Bagaria (starsid)
Differential revision: https://reviews.llvm.org/D49169
llvm-svn: 336793
The compact instruction shuffles active elements of vector
into lowest numbered elements and sets remaining elements
to zero.
e.g.
compact z0.s, p0, z1.s
llvm-svn: 336789
The LASTB and LASTA instructions extract the last active element,
or element after the last active, from the source vector.
The added variants are:
Scalar:
last(a|b) w0, p0, z0.b
last(a|b) w0, p0, z0.h
last(a|b) w0, p0, z0.s
last(a|b) x0, p0, z0.d
SIMD & FP Scalar:
last(a|b) b0, p0, z0.b
last(a|b) h0, p0, z0.h
last(a|b) s0, p0, z0.s
last(a|b) d0, p0, z0.d
The CLASTB and CLASTA conditionally extract the last or element after
the last active element from the source vector.
The added variants are:
Scalar:
clast(a|b) w0, p0, w0, z0.b
clast(a|b) w0, p0, w0, z0.h
clast(a|b) w0, p0, w0, z0.s
clast(a|b) x0, p0, x0, z0.d
SIMD & FP Scalar:
clast(a|b) b0, p0, b0, z0.b
clast(a|b) h0, p0, h0, z0.h
clast(a|b) s0, p0, s0, z0.s
clast(a|b) d0, p0, d0, z0.d
Vector:
clast(a|b) z0.b, p0, z0.b, z1.b
clast(a|b) z0.h, p0, z0.h, z1.h
clast(a|b) z0.s, p0, z0.s, z1.s
clast(a|b) z0.d, p0, z0.d, z1.d
Please refer to the architecture specification for more details on
the semantics of the added instructions.
llvm-svn: 336783
debug compilation dir when compiling assembly files with -g.
Part of PR38050.
Patch by Siddhartha Bagaria!
Differential Revision: https://reviews.llvm.org/D48988
llvm-svn: 336680
This patch adds support for the following instructions:
CLS (Count Leading Sign bits)
CLZ (Count Leading Zeros)
CNT (Count non-zero bits)
CNOT (Logically invert boolean condition in vector)
NOT (Bitwise invert vector)
FABS (Floating-point absolute value)
FNEG (Floating-point negate)
All operations are predicated and unary, e.g.
clz z0.s, p0/m, z1.s
- CLS, CLZ, CNT, CNOT and NOT have variants for 8, 16, 32
and 64 bit elements.
- FABS and FNEG have variants for 16, 32 and 64 bit elements.
llvm-svn: 336677
This patch adds support for the following instructions:
CNTB CNTH - Determine the number of active elements implied by
CNTW CNTD the named predicate constant, multiplied by an
immediate, e.g.
cnth x0, vl8, #16
CNTP - Count active predicate elements, e.g.
cntp x0, p0, p1.b
counts the number of active elements in p1, predicated
by p0, and stores the result in x0.
llvm-svn: 336552
This patch completes support for shifts, which include:
- LSL - Logical Shift Left
- LSLR - Logical Shift Left, Reversed form
- LSR - Logical Shift Right
- LSRR - Logical Shift Right, Reversed form
- ASR - Arithmetic Shift Right
- ASRR - Arithmetic Shift Right, Reversed form
- ASRD - Arithmetic Shift Right for Divide
In the following variants:
- Predicated shift by immediate - ASR, LSL, LSR, ASRD
e.g.
asr z0.h, p0/m, z0.h, #1
(active lanes of z0 shifted by #1)
- Unpredicated shift by immediate - ASR, LSL*, LSR*
e.g.
asr z0.h, z1.h, #1
(all lanes of z1 shifted by #1, stored in z0)
- Predicated shift by vector - ASR, LSL*, LSR*
e.g.
asr z0.h, p0/m, z0.h, z1.h
(active lanes of z0 shifted by z1, stored in z0)
- Predicated shift by vector, reversed form - ASRR, LSLR, LSRR
e.g.
lslr z0.h, p0/m, z0.h, z1.h
(active lanes of z1 shifted by z0, stored in z0)
- Predicated shift left/right by wide vector - ASR, LSL, LSR
e.g.
lsl z0.h, p0/m, z0.h, z1.d
(active lanes of z0 shifted by wide elements of vector z1)
- Unpredicated shift left/right by wide vector - ASR, LSL, LSR
e.g.
lsl z0.h, z1.h, z2.d
(all lanes of z1 shifted by wide elements of z2, stored in z0)
*Variants added in previous patches.
llvm-svn: 336547
Support for SVE's TBL instruction for programmable table
lookup/permute using vector of element indices, e.g.
tbl z0.d, { z1.d }, z2.d
stores elements from z1, indexed by elements from z2, into z0.
llvm-svn: 336544
This patch adds support for:
UZP1 Concatenate even elements from two vectors
UZP2 Concatenate odd elements from two vectors
TRN1 Interleave even elements from two vectors
TRN2 Interleave odd elements from two vectors
With variants for both data and predicate vectors, e.g.
uzp1 z0.b, z1.b, z2.b
trn2 p0.s, p1.s, p2.s
llvm-svn: 336531
Summary:
If LOCK prefix is not the first prefix in an instruction, LLVM
disassembler silently drops the prefix.
The fix is to select a proper instruction with a builtin LOCK prefix if
one exists.
Reviewers: craig.topper
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49001
llvm-svn: 336400
a deficiency in TableGen that has been addressed in r336334.
[AArch64][SVE] Asm: Support for predicated FP rounding instructions.
This patch also adds instructions for predicated FP square-root and
reciprocal exponent.
The added instructions are:
- FRINTI Round to integral value (current FPCR rounding mode)
- FRINTX Round to integral value (current FPCR rounding mode, signalling inexact)
- FRINTA Round to integral value (to nearest, with ties away from zero)
- FRINTN Round to integral value (to nearest, with ties to even)
- FRINTZ Round to integral value (toward zero)
- FRINTM Round to integral value (toward minus Infinity)
- FRINTP Round to integral value (toward plus Infinity)
- FSQRT Floating-point square root
- FRECPX Floating-point reciprocal exponent
llvm-svn: 336387
This patch also adds instructions for predicated FP square-root and
reciprocal exponent.
The added instructions are:
- FRINTI Round to integral value (current FPCR rounding mode)
- FRINTX Round to integral value (current FPCR rounding mode, signalling inexact)
- FRINTA Round to integral value (to nearest, with ties away from zero)
- FRINTN Round to integral value (to nearest, with ties to even)
- FRINTZ Round to integral value (toward zero)
- FRINTM Round to integral value (toward minus Infinity)
- FRINTP Round to integral value (toward plus Infinity)
- FSQRT Floating-point square root
- FRECPX Floating-point reciprocal exponent
llvm-svn: 336322
Support for negative immediates was implemented in
https://reviews.llvm.org/rL298380, however few instruction options were missing.
This change adds negative immediates support and respective tests
for the following:
ADD
ADDS
ADDS.W
AND.W
ANDS
BIC.W
BICS
BICS.W
SUB
SUBS
SUBS.W
Differential Revision: https://reviews.llvm.org/D48649
llvm-svn: 336286
This patch adds both a vector and an immediate form, e.g.
- Vector form:
subr z0.h, p0/m, z0.h, z1.h
subtract active elements of z0 from z1, and store the result in z0.
- Immediate form:
subr z0.h, z0.h, #255
subtract elements of z0, and store the result in z0.
llvm-svn: 336274
SVE overloads the AArch64 PSTATE condition flags and introduces
a set of condition code aliases for the assembler. The
details are described in section 2.2 of the architecture
reference manual supplement for SVE.
In short:
SVE alias => AArch64 name
--------------------------
NONE => EQ
ANY => NE
NLAST => HS
LAST => LO
FIRST => MI
NFRST => PL
PMORE => HI
PLAST => LS
TCONT => GE
TSTOP => LT
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48869
llvm-svn: 336245
This might make the error message added in r335668 unneeded, but I'm not sure yet.
The check for RIP is technically unnecessary since RIP is in GR64, but that fact is kind of surprising so be explicit.
llvm-svn: 336217
Unpredicated FP-multiply of SVE vector with a vector-element given by
vector[index], for example:
fmul z0.s, z1.s, z2.s[0]
which performs an unpredicated FP-multiply of all 32-bit elements in
'z1' with the first element from 'z2'.
This patch adds restricted register classes for SVE vectors:
ZPR_3b (only z0..z7 are allowed) - for indexed vector of 16/32-bit elements.
ZPR_4b (only z0..z15 are allowed) - for indexed vector of 64-bit elements.
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48823
llvm-svn: 336205
This adds the following system registers:
- RAS registers,
- MPAM registers,
- Activitiy monitor registers,
- Trace Extension registers,
- Timing insensitivity of data processing instructions,
- Enhanced Support for Nested Virtualization.
Differential Revision: https://reviews.llvm.org/D48871
llvm-svn: 336193
On darwin, all virtual sections have zerofill type, and having a
.zerofill directive in a non-virtual section is not allowed. Instead of
asserting, show a nicer error.
In order to use the equivalent of .zerofill in a non-virtual section,
the usage of .zero of .space is required.
This patch replaces the assert with an error.
Differential Revision: https://reviews.llvm.org/D48517
llvm-svn: 336127
This was my mistake for only running test/MC/X86 and test/CodeGen/X86.
Arguably .word should be removed from this test, as it is not supported
universally.
llvm-svn: 336107
Increment/decrement vector by multiple of predicate constraint
element count.
The variants added by this patch are:
- INCH, INCW, INC
and (saturating):
- SQINCH, SQINCW, SQINCD
- UQINCH, UQINCW, UQINCW
- SQDECH, SQINCW, SQINCD
- UQDECH, UQINCW, UQINCW
For example:
incw z0.s, all, mul #4
llvm-svn: 336090
These patches were previously reverted as they led to
buildbot time-outs caused by large switch statement in
printAliasInstr when using UBSan and O3. The issue has
been addressed with a workaround (r335525).
llvm-svn: 336079
section flags in the ELF assembler. This matches the defaults
given in the rest of MC.
Fixes PR37997 where we couldn't assemble our own assembly output
without warnings.
llvm-svn: 336072
Mostly just adding checks for Thumb2 instructions which correspond to
ARM instructions which already had diagnostics. While I'm here, also fix
ARM-mode strd to check the input registers correctly.
Differential Revision: https://reviews.llvm.org/D48610
llvm-svn: 335909
=== Generating the CG Profile ===
The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight.
After scanning all the functions, it generates an appending module flag containing the data. The format looks like:
```
!llvm.module.flags = !{!0}
!0 = !{i32 5, !"CG Profile", !1}
!1 = !{!2, !3, !4} ; List of edges
!2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32
!3 = !{void (i1)* @freq, void ()* @a, i64 11}
!4 = !{void (i1)* @freq, void ()* @b, i64 20}
```
Differential Revision: https://reviews.llvm.org/D48105
llvm-svn: 335794
The %eiz/%riz are dummy registers that force the encoder to emit a SIB byte when it normally wouldn't. By emitting them in the disassembly output we ensure that assembling the disassembler output would also produce a SIB byte.
This should match the behavior of objdump from binutils.
llvm-svn: 335768
When the condition code for an IT instruction is "AL" we get strange "15"
predicates on subsequent instructions. These are dealt with for most
instructions by treating them as "ARMCC::AL", but VFP takes a different path
which didn't have this code.
llvm-svn: 335594
IT instructions are allowed to have the 'AL' predicate, but it must never
result in an 'NV' predicated instruction. Essentially this means that all
branches must be 't' rather than 'e' if the predicate is 'AL'.
This patch adds a diagnostic for this during assembly (error because parsing
hits an assertion if allowed to continue) and an annotation during disassembly.
llvm-svn: 335593
Move expected-fail cases from directive-cpu.s to
directive-cpu-err.s. This allows us to remove the 'not' from the
llvm-mc invocation in directive-cpu.s so that this test will fail
in unexpected error cases. It also means that we are not relying
on all stderr coming before any stdout, which seems fragile.
Also make use of CHECK-NEXT to ensure that multiline error messages
really are occuring together.
And add a test to verify that .cpu with an arch version as extension
is rejected.
Differential Revision: https://reviews.llvm.org/D47873
llvm-svn: 335586
These were specifying an architecture version with .cpu directive,
which is invalid. As the error for this case outputs the problem
instruction we were still matching the expectations of FileCheck.
This patch fixes up the LSE tests to do what they seem to intend. A
follow-up patch will tighten up the directive tests.
Differential Revision: https://reviews.llvm.org/D47872
llvm-svn: 335585
-Ensure EIP isn't used with an index reigster.
-Ensure EIP isn't used as index register.
-Ensure base register isn't a vector register.
-Ensure eiz/riz usage matches the size of their base register.
llvm-svn: 335412
This allows us to check these:
-16-bit addressing doesn't support scale so we should error if we find one there.
-Multiplying ESP/RSP by a scale even if the scale is 1 should be an error because ESP/RSP can't be an index.
llvm-svn: 335398
By default, the second register gets assigned to the index register slot. But ESP can't be an index register so we need to swap it with the other register.
There's still a slight bug that we allow [EAX+ESP*1]. The existence of the multiply even though its with 1 should force ESP to the index register and trigger an error, but it doesn't currently.
llvm-svn: 335394
The second register is the index register and should only be %si or %di if used with a base register. And in that case the base register should be %bp or %bx.
This makes us compatible with gas.
We do still need to support both orders with Intel syntax which uses [bp+si] and [si+bp]
llvm-svn: 335384
(%bp) can't be encoded without a displacement. The encoding is instead used for displacement alone. So a 1 byte displacement of 0 must be used. But if there is an index register we can encode without a displacement.
llvm-svn: 335379
DWARF v5 explicitly represents file #0 in the line table. Prior
versions did not, so ".loc 0" is still an error in those cases.
Differential Revision: https://reviews.llvm.org/D48452
llvm-svn: 335350
This is the first pass in the main pipeline to use the legacy PM's
ability to run function analyses "on demand". Unfortunately, it turns
out there are bugs in that somewhat-hacky approach. At the very least,
it leaks memory and doesn't support -debug-pass=Structure. Unclear if
there are larger issues or not, but this should get the sanitizer bots
back to green by fixing the memory leaks.
llvm-svn: 335320
This patch adds support for generating a call graph profile from Branch Frequency Info.
The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight.
After scanning all the functions, it generates an appending module flag containing the data. The format looks like:
!llvm.module.flags = !{!0}
!0 = !{i32 5, !"CG Profile", !1}
!1 = !{!2, !3, !4} ; List of edges
!2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32
!3 = !{void (i1)* @freq, void ()* @a, i64 11}
!4 = !{void (i1)* @freq, void ()* @b, i64 20}
Differential Revision: https://reviews.llvm.org/D48105
llvm-svn: 335306
Update AMDGPU assembler syntax behind the code-object-v3 feature:
* Replace/rename most AMDGPU assembler directives/symbols and document them.
* Provide more diagnostics (e.g. values out of range, missing values, repeated
values).
* Provide path for backwards compatibility, even with underlying descriptor
changes.
Differential Revision: https://reviews.llvm.org/D47736
llvm-svn: 335281
Summary:
When expanding the PseudoTail in expandFunctionCall() we were using X6
to save the return address. Since this is a tail call the return
address is not needed, this patch replaces it with X0 to be ignored.
This matches the behaviour listed in the ISA V2.2 document page 110.
tail offset -----> jalr x0, x6, offset
GCC exhibits the same behavior.
Reviewers: apazos, asb, mgrang
Reviewed By: asb
Subscribers: rbar, johnrusso, simoncook, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01
Differential Revision: https://reviews.llvm.org/D48343
llvm-svn: 335239
Summary:
For sample and gather ops, we can accurately determine the set of
vaddr-size instruction variants that are required. This reduces
the size of instruction tables by ~5%.
The number of machine instruction opcodes is reduced from 10002
to 9476.
Change-Id: Ie7fc65d3657b762c7816017fe70b2e9bec644a8a
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D48168
llvm-svn: 335232
Summary:
This allows us to reduce the number of different machine instruction
opcodes, which reduces the table sizes and helps flatten the TableGen
multiclass hierarchies.
We can do this because for each hardware MIMG opcode, we have a full set
of IMAGE_xxx_Vn_Vm machine instructions for all required sizes of vdata
and vaddr registers. Instead of having separate D16 machine instructions,
a packed D16 instructions loading e.g. 4 components can simply use the
same V2 opcode variant that non-D16 instructions use.
We still require a TSFlag for D16 buffer instructions, because the
D16-ness of buffer instructions is part of the opcode. Renaming the flag
should help avoid future confusion.
The one non-obvious code change is that for gather4 instructions, the
disassembler can no longer automatically decide whether to use a V2 or
a V4 variant. The existing logic which choose the correct variant for
other MIMG instruction is extended to cover gather4 as well.
As a bonus, some of the assembler error messages are now more helpful
(e.g., complaining about a wrong data size instead of a non-existing
instruction).
While we're at it, delete a whole bunch of dead legacy TableGen code.
Change-Id: I89b02c2841c06f95e662541433e597f5d4553978
Reviewers: arsenm, rampitec, kzhuravl, artem.tamazov, dp, rtaylor
Subscribers: wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D47434
llvm-svn: 335222
These instructions were renamed in version 2.2 of the user-level ISA spec, but
the old name should also be accepted by standard tools.
llvm-svn: 335154
These are produced by GCC and supported by GAS, but not currently contained in
the pseudoinstruction listing in the RISC-V ISA manual.
llvm-svn: 335127
These are produced by GCC and supported by GAS, but not currently contained in
the pseudoinstruction listing in the RISC-V ISA manual.
llvm-svn: 335120
This value is the first vector instruction type in numerical order. The
previous value was incorrect, leaving TypeCVI_GATHER outside of the range
for vector instructions. This caused vector .new instructions to be
incorrectly encoded in the presence of gather.
llvm-svn: 335065
There are no provided instruction definitions for this architecture.
Reviewers: smaksimovic, atanasyan, abeserminji
Differential Revision: https://reviews.llvm.org/D48320
llvm-svn: 335057
Previously, some aliases were marked as not being available for microMIPS32R6,
but this was overridden at the top level.
Reviewers: atanasyan, abeserminji, smaksimovic
Differential Revision: https://reviews.llvm.org/D48321
llvm-svn: 335053
Summary:
One for register based, much like the existing definitions,
and one for stack based (suffix _S).
This allows us to use registers in most of LLVM (which works better),
and stack based in MC (which results in a simpler and more readable
assembler / disassembler).
Tried to keep this change as small as possible while passing tests,
follow-up commit will:
- Add reg->stack conversion in MI.
- Fix asm/disasm in MC to be stack based.
- Fix emitter to be stack based.
tests passing:
llvm-lit -v `find test -name WebAssembly`
test/CodeGen/WebAssembly
test/MC/WebAssembly
test/MC/Disassembler/WebAssembly
test/DebugInfo/WebAssembly
test/CodeGen/MIR/WebAssembly
test/tools/llvm-objdump/WebAssembly
Reviewers: dschuff, sbc100, jgravelle-google, sunfish
Subscribers: aheejin, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D48183
llvm-svn: 334985
This patch uses the DiagnosticPredicate for SVE predicate patterns
to improve their diagnostics, now giving a 'invalid operand' diagnostic
if the type is not an immediate or one of the expected pattern
labels.
Reviewers: samparker, SjoerdMeijer, javed.absar, fhahn
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48220
llvm-svn: 334983
The variants added by this patch are:
- SQINC signed increment, e.g. sqinc x0, w0, all, mul #4
- SQDEC signed decrement, e.g. sqdec x0, w0, all, mul #4
- UQINC unsigned increment, e.g. uqinc w0, all, mul #4
- UQDEC unsigned decrement, e.g. uqdec w0, all, mul #4
This patch includes asmparser changes to parse a GPR64 as a GPR32 in
order to satisfy the constraint check:
x0 == GPR64(w0)
in:
sqinc x0, w0, all, mul #4
^___^ (must match)
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47716
llvm-svn: 334980
This patch adds instructions for comparing elements from two vectors, e.g.
cmpgt p0.s, p0/z, z0.s, z1.s
and also adds support for comparing to a 64-bit wide element vector, e.g.
cmpgt p0.s, p0/z, z0.s, z1.d
The patch also contains aliases for certain comparisons, e.g.:
cmple p0.s, p0/z, z0.s, z1.s => cmpge p0.s, p0/z, z1.s, z0.s
cmplo p0.s, p0/z, z0.s, z1.s => cmphi p0.s, p0/z, z1.s, z0.s
cmpls p0.s, p0/z, z0.s, z1.s => cmphs p0.s, p0/z, z1.s, z0.s
cmplt p0.s, p0/z, z0.s, z1.s => cmpgt p0.s, p0/z, z1.s, z0.s
llvm-svn: 334931
We already have these aliases for EVEX enocded instructions, but not for the GPR, MMX, SSE, and VEX versions.
Also remove the vpextrw.s EVEX alias. That's not something gas implements.
llvm-svn: 334922
The .s assembly strings allow the reversed forms to be targeted from assembly which matches gas behavior. But when printing the instructions we should print them without the .s to match other tooling like objdump. By using InstAliases we can use the normal string in the instruction and just hide it from the assembly parser.
Ideally we'd add the .s versions to the legacy SSE and VEX versions as well for full compatibility with gas. Not sure how we got to state where only EVEX was supported.
llvm-svn: 334920
Unlike CodeGenInstruction, CodeGenInstAlias was flatting asm strings in its constructor. For instructions it was the users responsibility to flatten the string.
AsmMatcherEmitter didn't know this and treated them the same. This caused double flattening of InstAliases. This is mostly harmless unless the desired assembly string contains curly braces. The second flattening wouldn't know to ignore these and would remove the curly braces. And for variant 1 it would remove the contents of them as well.
To mitigate this, this patch makes removes the flattening from the CodeGenIntAlias constructor and modifies AsmWriterEmitter to account for the flattening not having been done.
llvm-svn: 334919
Support for SVE's predicated select instructions to select elements
from either vector, both in a data-vector and a predicate-vector
variant.
llvm-svn: 334905
Enables using the high and high-adjusted symbol modifiers on thread local
storage modifers in powerpc assembly. Needed to be able to support 64 bit
thread-pointer and dynamic-thread-pointer access sequences.
Differential Revision: https://reviews.llvm.org/D47754
llvm-svn: 334856
Add support for the "@high" and "@higha" symbol modifiers in powerpc64 assembly.
The modifiers represent accessing the segment consiting of bits 16-31 of a
64-bit address/offset.
Differential Revision: https://reviews.llvm.org/D47729
llvm-svn: 334855
Increment/decrement scalar register by (scaled) element count given by
predicate pattern, e.g. 'incw x0, all, mul #4'.
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D47713
llvm-svn: 334838
WebAssembly doesn't support more than one function per section
and we rely on function sections being unique. This change ignores
the section provided by the function to avoid two functions being
in the same section.
Without this change the object writer produces the following
error for this test:
LLVM ERROR: section already has a defining function: baz
Differential Revision: https://reviews.llvm.org/D48178
llvm-svn: 334752
In some cases, for example when compiling a preprocessed file, the
front-end is not able to provide an MD5 checksum for all files. When
that happens, omit the MD5 checksums from the final DWARF, because
DWARF doesn't have a way to indicate that some but not all files have
a checksum.
When assembling a .s file, and some but not all .file directives
provide an MD5 checksum, issue a warning and don't emit MD5 into the
DWARF.
Fixes PR37623.
Differential Revision: https://reviews.llvm.org/D48135
llvm-svn: 334710
All COFF targets should use @IMGREL32 relocations for symbol differences
against __ImageBase. Do the same for getSectionForConstant, so that
immediates lowered to globals get merged across TUs.
Patch by Chris January
Differential Revision: https://reviews.llvm.org/D47783
llvm-svn: 334523
Summary:
This is similar to D46319 (ARM). x86-64 psABI p40 gives an example:
leaq _GLOBAL_OFFSET_TABLE(%rip), %r15 # GOTPC32 reloc
GNU as creates R_X86_64_GOTPC32. However, MC currently emits R_X86_64_PC32.
Reviewers: javed.absar, echristo
Subscribers: kristof.beyls, llvm-commits, peter.smith, grimar
Differential Revision: https://reviews.llvm.org/D47507
llvm-svn: 334515
Don't provide the assembler source as the "root file" unless the user
asked to have debug info for the assembler source (with -g).
If the source doesn't provide an explicit ".file 0" then (a) use the
compilation directory as directory #0, and (b) use the file #1 info
for file #0 also.
Differential Revision: https://reviews.llvm.org/D48055
llvm-svn: 334512
Summary: When compiling with -fpic, in contrast to -fPIC, use only the
immediate field to index into the GOT. This saves space if the GOT is
known to be small. The linker will warn if the GOT is too large for
this method.
Reviewers: jyknight, venkatra
Reviewed By: jyknight
Subscribers: brad, fedor.sergeev, jrtc27, llvm-commits
Differential Revision: https://reviews.llvm.org/D47136
llvm-svn: 334383
The instruction makes use of a previously ignored field in the fence
instruction. It is introduced in the version 2.3 draft of the RISC-V
specification after much work by the Memory Model Task Group.
As clarified here <https://github.com/riscv/riscv-isa-manual/issues/186>,
the fence.tso assembler mnemonic does not have operands.
llvm-svn: 334278
The implementation follows the MIPS backend and expands the pseudo instruction
directly during asm parsing. As the result, only real MC instructions are
emitted to the MCStreamer. The actual expansion to real instructions is
similar to the expansion performed by the GNU Assembler.
This patch supersedes D41949.
Differential Revision: https://reviews.llvm.org/D46118
Patch by Mario Werner.
llvm-svn: 334203
These encodings correspond to the cases in the normal encoding scheme where there is no index and our modrm reading code initially decodes it as such. The VSIB handling code tried to compensate for this, but failed to add the base needed to make later code do the right thing.
Fixes PR37712.
llvm-svn: 334121
This causes "permission denied" error in some controlled test environment where source tree is read-only.
Differential Revision: https://reviews.llvm.org/D47839
llvm-svn: 334114
The test changes in r334031 give unstable pass/fail results on the
llvm-clang-x86_64-expensive-checks-win buildbot. Revert the test changes to
turn the bot green.
llvm-svn: 334084
On targets like Arm some relaxations may only be performed when certain
architectural features are available. As functions can be compiled with
differing levels of architectural support we must make a judgement on
whether we can relax based on the MCSubtargetInfo for the function. This
change passes through the MCSubtargetInfo for the function to
fixupNeedsRelaxation so that the decision on whether to relax can be made
per function. In this patch, only the ARM backend makes use of this
information. We must also pass the MCSubtargetInfo to applyFixup because
some fixups skip error checking on the assumption that relaxation has
occurred, to prevent code-generation errors applyFixup must see the same
MCSubtargetInfo as fixupNeedsRelaxation.
Differential Revision: https://reviews.llvm.org/D44928
llvm-svn: 334078
Summary:
Allow extended parsing of variable assembler assignment syntax and modify X86 to permit
VAR = register assignment. As we emit these as .set directives when possible, we inline
such expressions in output assembly.
Fixes PR37425.
Reviewers: rnk, void, echristo
Reviewed By: rnk
Subscribers: nickdesaulniers, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D47545
llvm-svn: 334022
When the branch target of a Thumb2 unconditional or conditonal branch is
resolved at assembly time, no range checking is performed on the result
leading to incorrect immediates. This change adds a range check:
+- 16 Megabytes for unconditional branches, +- 1 Megabyte for the
conditional branch.
Differential Revision: https://reviews.llvm.org/D46306
llvm-svn: 333997
The Thumb BL range is + or - either 16 Megabytes or 4 Megabytes depending
on whether the CPU supports Thumb2 or the v8-m baseline ops. The existing
check for BL range is incorrectly set at +- 32 Megabytes. This change
corrects the higher range and uses the lower range if the featurebits
don't have the necessary support for it.
Differential Revision: https://reviews.llvm.org/D46305
llvm-svn: 333991
For immediates used in DUP instructions that have the range
-128 to 127, or a multiple of 256 in the range -32768 to 32512,
one could argue that when the result element size is 16bits (.h),
the value can be considered both signed and unsigned.
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47619
llvm-svn: 333873
Print the first indexed element as a FP register, for example:
mov z0.d, z1.d[0]
Is now printed as:
mov z0.d, d1
Next to printing, this patch also adds aliases to parse 'mov z0.d, d1'.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47571
llvm-svn: 333872
Unpredicated copy of indexed SVE element to SVE vector,
along with MOV-aliases.
For example:
dup z0.h, z1.h[0]
duplicates the first 16-bit element from z1 to all elements in
the result vector z0.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D47570
llvm-svn: 333871
Predicated copy of floating-point immediate value to SVE vector,
along with MOV-aliases.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: javed.absar
Differential Revision: https://reviews.llvm.org/D47518
llvm-svn: 333869
Predicated copy of possibly shifted immediate value into SVE
vector, along with MOV-aliases.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47517
llvm-svn: 333868
Object FIle Representation
At codegen time this is emitted into the ELF file a pair of symbol indices and a weight. In assembly it looks like:
.cg_profile a, b, 32
.cg_profile freq, a, 11
.cg_profile freq, b, 20
When writing an ELF file these are put into a SHT_LLVM_CALL_GRAPH_PROFILE (0x6fff4c02) section as (uint32_t, uint32_t, uint64_t) tuples as (from symbol index, to symbol index, weight).
Differential Revision: https://reviews.llvm.org/D44965
llvm-svn: 333823
The `MipsAsmParser::loadImmediate` can load immediates of various sizes
into a register. Idea of this change is to use `loadImmediate` in the
`MipsAsmParser::expandMemInst` method to load offset into a register and
then call required load/store instruction.
The patch removes separate `expandLoadInst` and `expandStoreInst`
methods and does everything in the `expandMemInst` method to escape code
duplication.
Differential Revision: https://reviews.llvm.org/D47316
llvm-svn: 333774
Supporting GOT and TLS related relocations by the `.reloc` directive is
useful for purpose of testing various tools like a linker, for example.
llvm-svn: 333773
Unpredicated copy of floating-point immediate value into SVE vector,
along with MOV-aliases.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47482
llvm-svn: 333744