The SPLICE instruction splices two vectors into one vector using a
predicate. It copies the active elements from the first vector, and
then fills the remaining elements with the low-numbered elements from
the second vector.
The instruction has the following form, e.g.
splice z0.b, p0, z0.b, z1.b
for 8-bit elements. It also supports 16, 32 and
64-bit elements.
llvm-svn: 337253
This patch adds an instruction that allows extracting
a vector from a pair of vectors, given an immediate index
that describes the element position to extract from.
The instruction has the following assembly:
ext z0.b, z0.b, z1.b, #imm
where #imm is an immediate between 0 and 255.
llvm-svn: 337251
This support was partial and temporary. Now that we have
wasm object file support its no longer needed.
Differential Revision: https://reviews.llvm.org/D48744
llvm-svn: 337222
This patch adds support for the following unpack instructions:
- PUNPKLO, PUNPKHI Unpack elements from low/high half and
place into elements of twice their size.
e.g. punpklo p0.h, p0.b
- UUNPKLO, UUNPKHI Unpack elements from low/high half and
SUNPKLO, SUNPKHI place into elements of twice their size
after zero- or sign-extending the values.
e.g. uunpklo z0.h, z0.b
llvm-svn: 336982
AT_NAME was being emitted before the directory paths were remapped. This
ensures that all paths are remapped before anything is emitted.
An additional test case has been added.
Note that this only works if the replacement string is an absolute path.
If not, then AT_decl_file believes the new path is a relative path, and
joins that path with the compilation directory. I do not know of a good
way to resolve this.
Patch by: Siddhartha Bagaria (starsid)
Differential revision: https://reviews.llvm.org/D49169
llvm-svn: 336793
The compact instruction shuffles active elements of vector
into lowest numbered elements and sets remaining elements
to zero.
e.g.
compact z0.s, p0, z1.s
llvm-svn: 336789
The LASTB and LASTA instructions extract the last active element,
or element after the last active, from the source vector.
The added variants are:
Scalar:
last(a|b) w0, p0, z0.b
last(a|b) w0, p0, z0.h
last(a|b) w0, p0, z0.s
last(a|b) x0, p0, z0.d
SIMD & FP Scalar:
last(a|b) b0, p0, z0.b
last(a|b) h0, p0, z0.h
last(a|b) s0, p0, z0.s
last(a|b) d0, p0, z0.d
The CLASTB and CLASTA conditionally extract the last or element after
the last active element from the source vector.
The added variants are:
Scalar:
clast(a|b) w0, p0, w0, z0.b
clast(a|b) w0, p0, w0, z0.h
clast(a|b) w0, p0, w0, z0.s
clast(a|b) x0, p0, x0, z0.d
SIMD & FP Scalar:
clast(a|b) b0, p0, b0, z0.b
clast(a|b) h0, p0, h0, z0.h
clast(a|b) s0, p0, s0, z0.s
clast(a|b) d0, p0, d0, z0.d
Vector:
clast(a|b) z0.b, p0, z0.b, z1.b
clast(a|b) z0.h, p0, z0.h, z1.h
clast(a|b) z0.s, p0, z0.s, z1.s
clast(a|b) z0.d, p0, z0.d, z1.d
Please refer to the architecture specification for more details on
the semantics of the added instructions.
llvm-svn: 336783
debug compilation dir when compiling assembly files with -g.
Part of PR38050.
Patch by Siddhartha Bagaria!
Differential Revision: https://reviews.llvm.org/D48988
llvm-svn: 336680
This patch adds support for the following instructions:
CLS (Count Leading Sign bits)
CLZ (Count Leading Zeros)
CNT (Count non-zero bits)
CNOT (Logically invert boolean condition in vector)
NOT (Bitwise invert vector)
FABS (Floating-point absolute value)
FNEG (Floating-point negate)
All operations are predicated and unary, e.g.
clz z0.s, p0/m, z1.s
- CLS, CLZ, CNT, CNOT and NOT have variants for 8, 16, 32
and 64 bit elements.
- FABS and FNEG have variants for 16, 32 and 64 bit elements.
llvm-svn: 336677
This patch adds support for the following instructions:
CNTB CNTH - Determine the number of active elements implied by
CNTW CNTD the named predicate constant, multiplied by an
immediate, e.g.
cnth x0, vl8, #16
CNTP - Count active predicate elements, e.g.
cntp x0, p0, p1.b
counts the number of active elements in p1, predicated
by p0, and stores the result in x0.
llvm-svn: 336552
This patch completes support for shifts, which include:
- LSL - Logical Shift Left
- LSLR - Logical Shift Left, Reversed form
- LSR - Logical Shift Right
- LSRR - Logical Shift Right, Reversed form
- ASR - Arithmetic Shift Right
- ASRR - Arithmetic Shift Right, Reversed form
- ASRD - Arithmetic Shift Right for Divide
In the following variants:
- Predicated shift by immediate - ASR, LSL, LSR, ASRD
e.g.
asr z0.h, p0/m, z0.h, #1
(active lanes of z0 shifted by #1)
- Unpredicated shift by immediate - ASR, LSL*, LSR*
e.g.
asr z0.h, z1.h, #1
(all lanes of z1 shifted by #1, stored in z0)
- Predicated shift by vector - ASR, LSL*, LSR*
e.g.
asr z0.h, p0/m, z0.h, z1.h
(active lanes of z0 shifted by z1, stored in z0)
- Predicated shift by vector, reversed form - ASRR, LSLR, LSRR
e.g.
lslr z0.h, p0/m, z0.h, z1.h
(active lanes of z1 shifted by z0, stored in z0)
- Predicated shift left/right by wide vector - ASR, LSL, LSR
e.g.
lsl z0.h, p0/m, z0.h, z1.d
(active lanes of z0 shifted by wide elements of vector z1)
- Unpredicated shift left/right by wide vector - ASR, LSL, LSR
e.g.
lsl z0.h, z1.h, z2.d
(all lanes of z1 shifted by wide elements of z2, stored in z0)
*Variants added in previous patches.
llvm-svn: 336547
Support for SVE's TBL instruction for programmable table
lookup/permute using vector of element indices, e.g.
tbl z0.d, { z1.d }, z2.d
stores elements from z1, indexed by elements from z2, into z0.
llvm-svn: 336544
This patch adds support for:
UZP1 Concatenate even elements from two vectors
UZP2 Concatenate odd elements from two vectors
TRN1 Interleave even elements from two vectors
TRN2 Interleave odd elements from two vectors
With variants for both data and predicate vectors, e.g.
uzp1 z0.b, z1.b, z2.b
trn2 p0.s, p1.s, p2.s
llvm-svn: 336531
Summary:
If LOCK prefix is not the first prefix in an instruction, LLVM
disassembler silently drops the prefix.
The fix is to select a proper instruction with a builtin LOCK prefix if
one exists.
Reviewers: craig.topper
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49001
llvm-svn: 336400
a deficiency in TableGen that has been addressed in r336334.
[AArch64][SVE] Asm: Support for predicated FP rounding instructions.
This patch also adds instructions for predicated FP square-root and
reciprocal exponent.
The added instructions are:
- FRINTI Round to integral value (current FPCR rounding mode)
- FRINTX Round to integral value (current FPCR rounding mode, signalling inexact)
- FRINTA Round to integral value (to nearest, with ties away from zero)
- FRINTN Round to integral value (to nearest, with ties to even)
- FRINTZ Round to integral value (toward zero)
- FRINTM Round to integral value (toward minus Infinity)
- FRINTP Round to integral value (toward plus Infinity)
- FSQRT Floating-point square root
- FRECPX Floating-point reciprocal exponent
llvm-svn: 336387
This patch also adds instructions for predicated FP square-root and
reciprocal exponent.
The added instructions are:
- FRINTI Round to integral value (current FPCR rounding mode)
- FRINTX Round to integral value (current FPCR rounding mode, signalling inexact)
- FRINTA Round to integral value (to nearest, with ties away from zero)
- FRINTN Round to integral value (to nearest, with ties to even)
- FRINTZ Round to integral value (toward zero)
- FRINTM Round to integral value (toward minus Infinity)
- FRINTP Round to integral value (toward plus Infinity)
- FSQRT Floating-point square root
- FRECPX Floating-point reciprocal exponent
llvm-svn: 336322
Support for negative immediates was implemented in
https://reviews.llvm.org/rL298380, however few instruction options were missing.
This change adds negative immediates support and respective tests
for the following:
ADD
ADDS
ADDS.W
AND.W
ANDS
BIC.W
BICS
BICS.W
SUB
SUBS
SUBS.W
Differential Revision: https://reviews.llvm.org/D48649
llvm-svn: 336286
This patch adds both a vector and an immediate form, e.g.
- Vector form:
subr z0.h, p0/m, z0.h, z1.h
subtract active elements of z0 from z1, and store the result in z0.
- Immediate form:
subr z0.h, z0.h, #255
subtract elements of z0, and store the result in z0.
llvm-svn: 336274
SVE overloads the AArch64 PSTATE condition flags and introduces
a set of condition code aliases for the assembler. The
details are described in section 2.2 of the architecture
reference manual supplement for SVE.
In short:
SVE alias => AArch64 name
--------------------------
NONE => EQ
ANY => NE
NLAST => HS
LAST => LO
FIRST => MI
NFRST => PL
PMORE => HI
PLAST => LS
TCONT => GE
TSTOP => LT
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48869
llvm-svn: 336245
This might make the error message added in r335668 unneeded, but I'm not sure yet.
The check for RIP is technically unnecessary since RIP is in GR64, but that fact is kind of surprising so be explicit.
llvm-svn: 336217
Unpredicated FP-multiply of SVE vector with a vector-element given by
vector[index], for example:
fmul z0.s, z1.s, z2.s[0]
which performs an unpredicated FP-multiply of all 32-bit elements in
'z1' with the first element from 'z2'.
This patch adds restricted register classes for SVE vectors:
ZPR_3b (only z0..z7 are allowed) - for indexed vector of 16/32-bit elements.
ZPR_4b (only z0..z15 are allowed) - for indexed vector of 64-bit elements.
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48823
llvm-svn: 336205
This adds the following system registers:
- RAS registers,
- MPAM registers,
- Activitiy monitor registers,
- Trace Extension registers,
- Timing insensitivity of data processing instructions,
- Enhanced Support for Nested Virtualization.
Differential Revision: https://reviews.llvm.org/D48871
llvm-svn: 336193
On darwin, all virtual sections have zerofill type, and having a
.zerofill directive in a non-virtual section is not allowed. Instead of
asserting, show a nicer error.
In order to use the equivalent of .zerofill in a non-virtual section,
the usage of .zero of .space is required.
This patch replaces the assert with an error.
Differential Revision: https://reviews.llvm.org/D48517
llvm-svn: 336127
This was my mistake for only running test/MC/X86 and test/CodeGen/X86.
Arguably .word should be removed from this test, as it is not supported
universally.
llvm-svn: 336107
Increment/decrement vector by multiple of predicate constraint
element count.
The variants added by this patch are:
- INCH, INCW, INC
and (saturating):
- SQINCH, SQINCW, SQINCD
- UQINCH, UQINCW, UQINCW
- SQDECH, SQINCW, SQINCD
- UQDECH, UQINCW, UQINCW
For example:
incw z0.s, all, mul #4
llvm-svn: 336090
These patches were previously reverted as they led to
buildbot time-outs caused by large switch statement in
printAliasInstr when using UBSan and O3. The issue has
been addressed with a workaround (r335525).
llvm-svn: 336079
section flags in the ELF assembler. This matches the defaults
given in the rest of MC.
Fixes PR37997 where we couldn't assemble our own assembly output
without warnings.
llvm-svn: 336072
Mostly just adding checks for Thumb2 instructions which correspond to
ARM instructions which already had diagnostics. While I'm here, also fix
ARM-mode strd to check the input registers correctly.
Differential Revision: https://reviews.llvm.org/D48610
llvm-svn: 335909
=== Generating the CG Profile ===
The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight.
After scanning all the functions, it generates an appending module flag containing the data. The format looks like:
```
!llvm.module.flags = !{!0}
!0 = !{i32 5, !"CG Profile", !1}
!1 = !{!2, !3, !4} ; List of edges
!2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32
!3 = !{void (i1)* @freq, void ()* @a, i64 11}
!4 = !{void (i1)* @freq, void ()* @b, i64 20}
```
Differential Revision: https://reviews.llvm.org/D48105
llvm-svn: 335794
The %eiz/%riz are dummy registers that force the encoder to emit a SIB byte when it normally wouldn't. By emitting them in the disassembly output we ensure that assembling the disassembler output would also produce a SIB byte.
This should match the behavior of objdump from binutils.
llvm-svn: 335768
When the condition code for an IT instruction is "AL" we get strange "15"
predicates on subsequent instructions. These are dealt with for most
instructions by treating them as "ARMCC::AL", but VFP takes a different path
which didn't have this code.
llvm-svn: 335594
IT instructions are allowed to have the 'AL' predicate, but it must never
result in an 'NV' predicated instruction. Essentially this means that all
branches must be 't' rather than 'e' if the predicate is 'AL'.
This patch adds a diagnostic for this during assembly (error because parsing
hits an assertion if allowed to continue) and an annotation during disassembly.
llvm-svn: 335593
Move expected-fail cases from directive-cpu.s to
directive-cpu-err.s. This allows us to remove the 'not' from the
llvm-mc invocation in directive-cpu.s so that this test will fail
in unexpected error cases. It also means that we are not relying
on all stderr coming before any stdout, which seems fragile.
Also make use of CHECK-NEXT to ensure that multiline error messages
really are occuring together.
And add a test to verify that .cpu with an arch version as extension
is rejected.
Differential Revision: https://reviews.llvm.org/D47873
llvm-svn: 335586
These were specifying an architecture version with .cpu directive,
which is invalid. As the error for this case outputs the problem
instruction we were still matching the expectations of FileCheck.
This patch fixes up the LSE tests to do what they seem to intend. A
follow-up patch will tighten up the directive tests.
Differential Revision: https://reviews.llvm.org/D47872
llvm-svn: 335585
-Ensure EIP isn't used with an index reigster.
-Ensure EIP isn't used as index register.
-Ensure base register isn't a vector register.
-Ensure eiz/riz usage matches the size of their base register.
llvm-svn: 335412
This allows us to check these:
-16-bit addressing doesn't support scale so we should error if we find one there.
-Multiplying ESP/RSP by a scale even if the scale is 1 should be an error because ESP/RSP can't be an index.
llvm-svn: 335398
By default, the second register gets assigned to the index register slot. But ESP can't be an index register so we need to swap it with the other register.
There's still a slight bug that we allow [EAX+ESP*1]. The existence of the multiply even though its with 1 should force ESP to the index register and trigger an error, but it doesn't currently.
llvm-svn: 335394
The second register is the index register and should only be %si or %di if used with a base register. And in that case the base register should be %bp or %bx.
This makes us compatible with gas.
We do still need to support both orders with Intel syntax which uses [bp+si] and [si+bp]
llvm-svn: 335384
(%bp) can't be encoded without a displacement. The encoding is instead used for displacement alone. So a 1 byte displacement of 0 must be used. But if there is an index register we can encode without a displacement.
llvm-svn: 335379
DWARF v5 explicitly represents file #0 in the line table. Prior
versions did not, so ".loc 0" is still an error in those cases.
Differential Revision: https://reviews.llvm.org/D48452
llvm-svn: 335350
This is the first pass in the main pipeline to use the legacy PM's
ability to run function analyses "on demand". Unfortunately, it turns
out there are bugs in that somewhat-hacky approach. At the very least,
it leaks memory and doesn't support -debug-pass=Structure. Unclear if
there are larger issues or not, but this should get the sanitizer bots
back to green by fixing the memory leaks.
llvm-svn: 335320
This patch adds support for generating a call graph profile from Branch Frequency Info.
The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight.
After scanning all the functions, it generates an appending module flag containing the data. The format looks like:
!llvm.module.flags = !{!0}
!0 = !{i32 5, !"CG Profile", !1}
!1 = !{!2, !3, !4} ; List of edges
!2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32
!3 = !{void (i1)* @freq, void ()* @a, i64 11}
!4 = !{void (i1)* @freq, void ()* @b, i64 20}
Differential Revision: https://reviews.llvm.org/D48105
llvm-svn: 335306
Update AMDGPU assembler syntax behind the code-object-v3 feature:
* Replace/rename most AMDGPU assembler directives/symbols and document them.
* Provide more diagnostics (e.g. values out of range, missing values, repeated
values).
* Provide path for backwards compatibility, even with underlying descriptor
changes.
Differential Revision: https://reviews.llvm.org/D47736
llvm-svn: 335281
Summary:
When expanding the PseudoTail in expandFunctionCall() we were using X6
to save the return address. Since this is a tail call the return
address is not needed, this patch replaces it with X0 to be ignored.
This matches the behaviour listed in the ISA V2.2 document page 110.
tail offset -----> jalr x0, x6, offset
GCC exhibits the same behavior.
Reviewers: apazos, asb, mgrang
Reviewed By: asb
Subscribers: rbar, johnrusso, simoncook, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01
Differential Revision: https://reviews.llvm.org/D48343
llvm-svn: 335239
Summary:
For sample and gather ops, we can accurately determine the set of
vaddr-size instruction variants that are required. This reduces
the size of instruction tables by ~5%.
The number of machine instruction opcodes is reduced from 10002
to 9476.
Change-Id: Ie7fc65d3657b762c7816017fe70b2e9bec644a8a
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D48168
llvm-svn: 335232
Summary:
This allows us to reduce the number of different machine instruction
opcodes, which reduces the table sizes and helps flatten the TableGen
multiclass hierarchies.
We can do this because for each hardware MIMG opcode, we have a full set
of IMAGE_xxx_Vn_Vm machine instructions for all required sizes of vdata
and vaddr registers. Instead of having separate D16 machine instructions,
a packed D16 instructions loading e.g. 4 components can simply use the
same V2 opcode variant that non-D16 instructions use.
We still require a TSFlag for D16 buffer instructions, because the
D16-ness of buffer instructions is part of the opcode. Renaming the flag
should help avoid future confusion.
The one non-obvious code change is that for gather4 instructions, the
disassembler can no longer automatically decide whether to use a V2 or
a V4 variant. The existing logic which choose the correct variant for
other MIMG instruction is extended to cover gather4 as well.
As a bonus, some of the assembler error messages are now more helpful
(e.g., complaining about a wrong data size instead of a non-existing
instruction).
While we're at it, delete a whole bunch of dead legacy TableGen code.
Change-Id: I89b02c2841c06f95e662541433e597f5d4553978
Reviewers: arsenm, rampitec, kzhuravl, artem.tamazov, dp, rtaylor
Subscribers: wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D47434
llvm-svn: 335222
These instructions were renamed in version 2.2 of the user-level ISA spec, but
the old name should also be accepted by standard tools.
llvm-svn: 335154
These are produced by GCC and supported by GAS, but not currently contained in
the pseudoinstruction listing in the RISC-V ISA manual.
llvm-svn: 335127
These are produced by GCC and supported by GAS, but not currently contained in
the pseudoinstruction listing in the RISC-V ISA manual.
llvm-svn: 335120
This value is the first vector instruction type in numerical order. The
previous value was incorrect, leaving TypeCVI_GATHER outside of the range
for vector instructions. This caused vector .new instructions to be
incorrectly encoded in the presence of gather.
llvm-svn: 335065
There are no provided instruction definitions for this architecture.
Reviewers: smaksimovic, atanasyan, abeserminji
Differential Revision: https://reviews.llvm.org/D48320
llvm-svn: 335057
Previously, some aliases were marked as not being available for microMIPS32R6,
but this was overridden at the top level.
Reviewers: atanasyan, abeserminji, smaksimovic
Differential Revision: https://reviews.llvm.org/D48321
llvm-svn: 335053
Summary:
One for register based, much like the existing definitions,
and one for stack based (suffix _S).
This allows us to use registers in most of LLVM (which works better),
and stack based in MC (which results in a simpler and more readable
assembler / disassembler).
Tried to keep this change as small as possible while passing tests,
follow-up commit will:
- Add reg->stack conversion in MI.
- Fix asm/disasm in MC to be stack based.
- Fix emitter to be stack based.
tests passing:
llvm-lit -v `find test -name WebAssembly`
test/CodeGen/WebAssembly
test/MC/WebAssembly
test/MC/Disassembler/WebAssembly
test/DebugInfo/WebAssembly
test/CodeGen/MIR/WebAssembly
test/tools/llvm-objdump/WebAssembly
Reviewers: dschuff, sbc100, jgravelle-google, sunfish
Subscribers: aheejin, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D48183
llvm-svn: 334985
This patch uses the DiagnosticPredicate for SVE predicate patterns
to improve their diagnostics, now giving a 'invalid operand' diagnostic
if the type is not an immediate or one of the expected pattern
labels.
Reviewers: samparker, SjoerdMeijer, javed.absar, fhahn
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48220
llvm-svn: 334983
The variants added by this patch are:
- SQINC signed increment, e.g. sqinc x0, w0, all, mul #4
- SQDEC signed decrement, e.g. sqdec x0, w0, all, mul #4
- UQINC unsigned increment, e.g. uqinc w0, all, mul #4
- UQDEC unsigned decrement, e.g. uqdec w0, all, mul #4
This patch includes asmparser changes to parse a GPR64 as a GPR32 in
order to satisfy the constraint check:
x0 == GPR64(w0)
in:
sqinc x0, w0, all, mul #4
^___^ (must match)
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47716
llvm-svn: 334980
This patch adds instructions for comparing elements from two vectors, e.g.
cmpgt p0.s, p0/z, z0.s, z1.s
and also adds support for comparing to a 64-bit wide element vector, e.g.
cmpgt p0.s, p0/z, z0.s, z1.d
The patch also contains aliases for certain comparisons, e.g.:
cmple p0.s, p0/z, z0.s, z1.s => cmpge p0.s, p0/z, z1.s, z0.s
cmplo p0.s, p0/z, z0.s, z1.s => cmphi p0.s, p0/z, z1.s, z0.s
cmpls p0.s, p0/z, z0.s, z1.s => cmphs p0.s, p0/z, z1.s, z0.s
cmplt p0.s, p0/z, z0.s, z1.s => cmpgt p0.s, p0/z, z1.s, z0.s
llvm-svn: 334931
We already have these aliases for EVEX enocded instructions, but not for the GPR, MMX, SSE, and VEX versions.
Also remove the vpextrw.s EVEX alias. That's not something gas implements.
llvm-svn: 334922
The .s assembly strings allow the reversed forms to be targeted from assembly which matches gas behavior. But when printing the instructions we should print them without the .s to match other tooling like objdump. By using InstAliases we can use the normal string in the instruction and just hide it from the assembly parser.
Ideally we'd add the .s versions to the legacy SSE and VEX versions as well for full compatibility with gas. Not sure how we got to state where only EVEX was supported.
llvm-svn: 334920
Unlike CodeGenInstruction, CodeGenInstAlias was flatting asm strings in its constructor. For instructions it was the users responsibility to flatten the string.
AsmMatcherEmitter didn't know this and treated them the same. This caused double flattening of InstAliases. This is mostly harmless unless the desired assembly string contains curly braces. The second flattening wouldn't know to ignore these and would remove the curly braces. And for variant 1 it would remove the contents of them as well.
To mitigate this, this patch makes removes the flattening from the CodeGenIntAlias constructor and modifies AsmWriterEmitter to account for the flattening not having been done.
llvm-svn: 334919
Support for SVE's predicated select instructions to select elements
from either vector, both in a data-vector and a predicate-vector
variant.
llvm-svn: 334905
Enables using the high and high-adjusted symbol modifiers on thread local
storage modifers in powerpc assembly. Needed to be able to support 64 bit
thread-pointer and dynamic-thread-pointer access sequences.
Differential Revision: https://reviews.llvm.org/D47754
llvm-svn: 334856
Add support for the "@high" and "@higha" symbol modifiers in powerpc64 assembly.
The modifiers represent accessing the segment consiting of bits 16-31 of a
64-bit address/offset.
Differential Revision: https://reviews.llvm.org/D47729
llvm-svn: 334855
Increment/decrement scalar register by (scaled) element count given by
predicate pattern, e.g. 'incw x0, all, mul #4'.
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D47713
llvm-svn: 334838
WebAssembly doesn't support more than one function per section
and we rely on function sections being unique. This change ignores
the section provided by the function to avoid two functions being
in the same section.
Without this change the object writer produces the following
error for this test:
LLVM ERROR: section already has a defining function: baz
Differential Revision: https://reviews.llvm.org/D48178
llvm-svn: 334752
In some cases, for example when compiling a preprocessed file, the
front-end is not able to provide an MD5 checksum for all files. When
that happens, omit the MD5 checksums from the final DWARF, because
DWARF doesn't have a way to indicate that some but not all files have
a checksum.
When assembling a .s file, and some but not all .file directives
provide an MD5 checksum, issue a warning and don't emit MD5 into the
DWARF.
Fixes PR37623.
Differential Revision: https://reviews.llvm.org/D48135
llvm-svn: 334710
All COFF targets should use @IMGREL32 relocations for symbol differences
against __ImageBase. Do the same for getSectionForConstant, so that
immediates lowered to globals get merged across TUs.
Patch by Chris January
Differential Revision: https://reviews.llvm.org/D47783
llvm-svn: 334523
Summary:
This is similar to D46319 (ARM). x86-64 psABI p40 gives an example:
leaq _GLOBAL_OFFSET_TABLE(%rip), %r15 # GOTPC32 reloc
GNU as creates R_X86_64_GOTPC32. However, MC currently emits R_X86_64_PC32.
Reviewers: javed.absar, echristo
Subscribers: kristof.beyls, llvm-commits, peter.smith, grimar
Differential Revision: https://reviews.llvm.org/D47507
llvm-svn: 334515
Don't provide the assembler source as the "root file" unless the user
asked to have debug info for the assembler source (with -g).
If the source doesn't provide an explicit ".file 0" then (a) use the
compilation directory as directory #0, and (b) use the file #1 info
for file #0 also.
Differential Revision: https://reviews.llvm.org/D48055
llvm-svn: 334512
Summary: When compiling with -fpic, in contrast to -fPIC, use only the
immediate field to index into the GOT. This saves space if the GOT is
known to be small. The linker will warn if the GOT is too large for
this method.
Reviewers: jyknight, venkatra
Reviewed By: jyknight
Subscribers: brad, fedor.sergeev, jrtc27, llvm-commits
Differential Revision: https://reviews.llvm.org/D47136
llvm-svn: 334383
The instruction makes use of a previously ignored field in the fence
instruction. It is introduced in the version 2.3 draft of the RISC-V
specification after much work by the Memory Model Task Group.
As clarified here <https://github.com/riscv/riscv-isa-manual/issues/186>,
the fence.tso assembler mnemonic does not have operands.
llvm-svn: 334278
The implementation follows the MIPS backend and expands the pseudo instruction
directly during asm parsing. As the result, only real MC instructions are
emitted to the MCStreamer. The actual expansion to real instructions is
similar to the expansion performed by the GNU Assembler.
This patch supersedes D41949.
Differential Revision: https://reviews.llvm.org/D46118
Patch by Mario Werner.
llvm-svn: 334203
These encodings correspond to the cases in the normal encoding scheme where there is no index and our modrm reading code initially decodes it as such. The VSIB handling code tried to compensate for this, but failed to add the base needed to make later code do the right thing.
Fixes PR37712.
llvm-svn: 334121
This causes "permission denied" error in some controlled test environment where source tree is read-only.
Differential Revision: https://reviews.llvm.org/D47839
llvm-svn: 334114
The test changes in r334031 give unstable pass/fail results on the
llvm-clang-x86_64-expensive-checks-win buildbot. Revert the test changes to
turn the bot green.
llvm-svn: 334084
On targets like Arm some relaxations may only be performed when certain
architectural features are available. As functions can be compiled with
differing levels of architectural support we must make a judgement on
whether we can relax based on the MCSubtargetInfo for the function. This
change passes through the MCSubtargetInfo for the function to
fixupNeedsRelaxation so that the decision on whether to relax can be made
per function. In this patch, only the ARM backend makes use of this
information. We must also pass the MCSubtargetInfo to applyFixup because
some fixups skip error checking on the assumption that relaxation has
occurred, to prevent code-generation errors applyFixup must see the same
MCSubtargetInfo as fixupNeedsRelaxation.
Differential Revision: https://reviews.llvm.org/D44928
llvm-svn: 334078
Summary:
Allow extended parsing of variable assembler assignment syntax and modify X86 to permit
VAR = register assignment. As we emit these as .set directives when possible, we inline
such expressions in output assembly.
Fixes PR37425.
Reviewers: rnk, void, echristo
Reviewed By: rnk
Subscribers: nickdesaulniers, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D47545
llvm-svn: 334022
When the branch target of a Thumb2 unconditional or conditonal branch is
resolved at assembly time, no range checking is performed on the result
leading to incorrect immediates. This change adds a range check:
+- 16 Megabytes for unconditional branches, +- 1 Megabyte for the
conditional branch.
Differential Revision: https://reviews.llvm.org/D46306
llvm-svn: 333997
The Thumb BL range is + or - either 16 Megabytes or 4 Megabytes depending
on whether the CPU supports Thumb2 or the v8-m baseline ops. The existing
check for BL range is incorrectly set at +- 32 Megabytes. This change
corrects the higher range and uses the lower range if the featurebits
don't have the necessary support for it.
Differential Revision: https://reviews.llvm.org/D46305
llvm-svn: 333991
For immediates used in DUP instructions that have the range
-128 to 127, or a multiple of 256 in the range -32768 to 32512,
one could argue that when the result element size is 16bits (.h),
the value can be considered both signed and unsigned.
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47619
llvm-svn: 333873
Print the first indexed element as a FP register, for example:
mov z0.d, z1.d[0]
Is now printed as:
mov z0.d, d1
Next to printing, this patch also adds aliases to parse 'mov z0.d, d1'.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47571
llvm-svn: 333872
Unpredicated copy of indexed SVE element to SVE vector,
along with MOV-aliases.
For example:
dup z0.h, z1.h[0]
duplicates the first 16-bit element from z1 to all elements in
the result vector z0.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D47570
llvm-svn: 333871
Predicated copy of floating-point immediate value to SVE vector,
along with MOV-aliases.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: javed.absar
Differential Revision: https://reviews.llvm.org/D47518
llvm-svn: 333869
Predicated copy of possibly shifted immediate value into SVE
vector, along with MOV-aliases.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47517
llvm-svn: 333868
Object FIle Representation
At codegen time this is emitted into the ELF file a pair of symbol indices and a weight. In assembly it looks like:
.cg_profile a, b, 32
.cg_profile freq, a, 11
.cg_profile freq, b, 20
When writing an ELF file these are put into a SHT_LLVM_CALL_GRAPH_PROFILE (0x6fff4c02) section as (uint32_t, uint32_t, uint64_t) tuples as (from symbol index, to symbol index, weight).
Differential Revision: https://reviews.llvm.org/D44965
llvm-svn: 333823
The `MipsAsmParser::loadImmediate` can load immediates of various sizes
into a register. Idea of this change is to use `loadImmediate` in the
`MipsAsmParser::expandMemInst` method to load offset into a register and
then call required load/store instruction.
The patch removes separate `expandLoadInst` and `expandStoreInst`
methods and does everything in the `expandMemInst` method to escape code
duplication.
Differential Revision: https://reviews.llvm.org/D47316
llvm-svn: 333774
Supporting GOT and TLS related relocations by the `.reloc` directive is
useful for purpose of testing various tools like a linker, for example.
llvm-svn: 333773
Unpredicated copy of floating-point immediate value into SVE vector,
along with MOV-aliases.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47482
llvm-svn: 333744
Summary:
"Unknown" for platforms that were not manually added into the switch
did not make sense at all. Now it prints Target + addend for all
elf-machines that were not explicitly mentioned.
Addresses PR21059 and PR25124.
Original author: fedor.sergeev
Reviewers: jyknight, espindola, fedor.sergeev
Reviewed By: jyknight
Subscribers: eraman, dcederman, jfb, dschuff, aheejin, llvm-commits
Differential Revision: https://reviews.llvm.org/D36464
llvm-svn: 333726
A 5-bit value can occur when EVEX.X is 0 due to it being used to extend modrm.rm to encode XMM16-31. But if modrm.rm instead encodes a GPR, the Intel documentation says EVEX.X should be ignored so just mask it to 4 bits once we know its a GPR.
llvm-svn: 333725
EVEX.X is used to extended modrm.rm when the instruction encodes a XMM/YMM/ZMM register. But we aren't properly ignoring it when it encodes a GPR and we end up printing whatever registers exist in X86 register enum after the GPRs.
llvm-svn: 333724
This was an accidental side effect of EVEX.X being used to encode XMM16-XMM31 using modrm.rm with modrm.mod==0x3.
I think there are still more bugs related to this.
llvm-svn: 333722