Intrinsic already existed for llvm.SI.tbuffer.store
Needed tbuffer.load and also re-implementing the intrinsic as llvm.amdgcn.tbuffer.*
Added CodeGen tests for the 2 new variants added.
Left the original llvm.SI.tbuffer.store implementation to avoid issues with existing code
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, tpr
Differential Revision: https://reviews.llvm.org/D30687
llvm-svn: 306031
A new Gfx9 dasm test added with approx 29000 cases.
Existing tests extended by (approx.):
* Gfx7 asm: 5000 test cases
* Gfx8 asm: 5000 test cases
* Gfx9 asm: 14400 test cases
* Gfx8 dasm: 5200 test cases
llvm-svn: 305702
Note that if we need the result of both the divide and the modulo then we
compute the modulo based on the result of the divide and not using the new
hardware instruction.
Commit on behalf of STEFAN PINTILIE.
Differential Revision: https://reviews.llvm.org/D33940
llvm-svn: 305210
Changed immediate type for repl.ph from uimm10 to simm10 as per the specs.
Repl.qb still accepts uimm8. Both instructions now mimic the behaviour of
GNU as.
Patch by Stefan Maksimovic.
Differential Revision: https://reviews.llvm.org/D33594
llvm-svn: 304918
This adds assembler / disassembler support for the decimal
floating-point instructions. Since LLVM does not yet have
support for decimal float types, these cannot be used for
codegen at this point.
llvm-svn: 304203
This adds assembler / disassembler support for the hexadecimal
floating-point instructions. Since the Linux ABI does not use
any hex float data types, these are not useful for codegen.
llvm-svn: 304202
AVX512_VPOPCNTDQ is a new feature set that was published by Intel.
The patch represents the LLVM side of the addition of two new intrinsic based instructions (vpopcntd and vpopcntq).
Differential Revision: https://reviews.llvm.org/D33169
llvm-svn: 303858
According to Power ISA V3.0 document, the first source operand of mtvsrdd is constant 0 if r0 is specified. So the corresponding register constraint should be g8rc_nox0.
This bug caused wrong output generated by 401.bzip2 when -mcpu=power9 and fdo are specified.
Differential Revision: https://reviews.llvm.org/D32880
llvm-svn: 302834
We don't use it and it was removed in gfx9, and the encoding
bit repurposed.
Additionally actually using it requires changing the output register
class, which wasn't done anyway.
llvm-svn: 302814
This adds a few missing instructions for the assembler and
disassembler. Those should be the last missing general-
purpose (Chapter 7) instructions for the z10 ISA.
llvm-svn: 302667
This adds the remaining general arithmetic instructions
for assembler / disassembler use. Most of these are not
useful for codegen; a few might be, and those are listed
in the README.txt for future improvements.
llvm-svn: 302665
The assembler and disassmebler test cases started out formatted and
sorted in a particular way, but this got lost over time as patches
were added. Reformat them again. NFC.
llvm-svn: 302642
That's only a required extension as of v8.1a.
Remove it from the "generic" CPU as well: it should only support the
base ISA (and binutils agrees).
Also unify the MC tests into crc.s and arm64-crc32.s
llvm-svn: 302077
This patch adds support for the the LightWeight Profiling (LWP) instructions which are available on all AMD Bulldozer class CPUs (bdver1 to bdver4).
Reapplied - this time without changing line endings of existing files.
Differential Revision: https://reviews.llvm.org/D32769
llvm-svn: 302041
This patch adds support for the the LightWeight Profiling (LWP) instructions which are available on all AMD Bulldozer class CPUs (bdver1 to bdver4).
Differential Revision: https://reviews.llvm.org/D32769
llvm-svn: 302028
The unused dummy src2_modifiers is missing, so it crashes
when trying to print it.
I tried to fully remove src2_modifiers, but there are some
irritations in the places where it is converted to mad since
it starts to require modifying use lists while iterating over
them.
llvm-svn: 299861
- corrected DS_GWS_* opcodes (see VI_Shader_Programming#16.pdf for detailed description)
- address operand is not used
- several opcodes have data operand
- all opcodes have offset modifier
- DS_AND_SRC2_B32: corrected typo in mnemo
- DS_WRAP_RTN_F32 replaced with DS_WRAP_RTN_B32
- added CI/VI opcodes:
- DS_CONDXCHG32_RTN_B64
- DS_GWS_SEMA_RELEASE_ALL
- added VI opcodes:
- DS_CONSUME
- DS_APPEND
- DS_ORDERED_COUNT
Differential Revision: https://reviews.llvm.org/D31707
llvm-svn: 299767
mfvrd and mffprd are both alias to mfvrsd.
This patch enables correct parsing of the aliases, but we still emit a mfvrsd.
Committing on behalf of brunoalr (Bruno Rosa).
Differential Revision: https://reviews.llvm.org/D29177
llvm-svn: 297849
This patch does the following.
1. Adds an Intrinsic int_x86_clzero which works with __builtin_ia32_clzero
2. Identifies clzero feature using cpuid info. (Function:8000_0008, Checks if EBX[0]=1)
3. Adds the clzero feature under znver1 architecture.
4. The custom inserter is added in Lowering.
5. A testcase is added to check the intrinsic.
6. The clzero instruction is added to assembler test.
Patch by Ganesh Gopalasubramanian with a couple formatting tweaks, a disassembler test, and using update_llc_test.py from me.
Differential revision: https://reviews.llvm.org/D29385
llvm-svn: 294558
This patch checks the number of operands in the resulting
instruction instead of just the alias, then skips over
tied operands when generating the printing method.
This allows us to generate the preferred assembly syntax
for the AArch64 'ins' instruction, which should always be
displayed as 'mov' according to the ARMARM.
Several unit tests have changed as a result, but only to
reflect the preferred disassembly.
Some other InstAlias patterns (movk/bic/orr) needed a
slight adjustment to stop them becoming the default
and breaking other unit tests.
Patch by Graham Hunter.
Differential Revision: https://reviews.llvm.org/D29219
llvm-svn: 294437
Summary:
Adds the following instructions:
* mfpmr
* mtpmr
* icblc
* icblq
* icbtls
Fix the scheduling for mtspr on e5500, which uses CFX0, instead of
SFX0/SFX1 as on e500mc.
Addresses PR 31538.
Differential Revision: https://reviews.llvm.org/D29002
llvm-svn: 293417
Summary: Small change to get the FREEP instruction to decode properly.
Reviewers: craig.topper
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29193
llvm-svn: 293314
Reason: broke ASAN bots with a global buffer overflow.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/2291
Each test contains 20-30K test cases but takes only several (from 4 to 10)
seconds to complete on average machine. The tests cover the majority of
AMDGPU Gfx7/Gfx8 instructions, including many dark corners, and intended
to quickly find out if something is broken.
llvm-svn: 292974
Each test contains 20-30K test cases but takes only several (from 4 to 10)
seconds to complete on average machine. The tests cover the majority of
AMDGPU Gfx7/Gfx8 instructions, including many dark corners, and intended
to quickly find out if something is broken.
llvm-svn: 292922
Permit explicit $fcc<X> operand in c.cond.fmt instruction.
Add c.cond.fmt to the MIPS to microMIPS instruction mapping table.
Check that $fcc1 - $fcc7 are unusable for MIPS-I to MIPS-III for
c.cond.fmt, bc1t, bc1f.
Reviewers: seanbruno, zoran.jovanovic, vkalintiris
Differential Revision: https://reviews.llvm.org/D24510
llvm-svn: 292117
Summary: Real instruction should copy constraints from real instruction. This allows auto-generated disassembler to correctly process tied operands.
Reviewers: nhaustov, vpykhtin, tstellarAMD
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27847
llvm-svn: 290336
Since 32-bit instructions with 32-bit input immediate behavior
are used to materialize 16-bit constants in 32-bit registers
for 16-bit instructions, determining the legality based
on the size is incorrect. Change operands to have the size
specified in the type.
Also adds a workaround for a disassembler bug that
produces an immediate MCOperand for an operand that
is supposed to be OPERAND_REGISTER.
The assembler appears to accept out of bounds immediates and
truncates them, but this seems to be an issue for 32-bit
already.
llvm-svn: 289306
Add assembler support for all atomic instructions that weren't already
supported. Some of those could be used to implement codegen for 128-bit
atomic operations, but this isn't done here yet.
llvm-svn: 288526
Add assembler support for instructions manipulating the FPC.
Also add codegen support via the GCC compatibility builtins:
__builtin_s390_sfpc
__builtin_s390_efpc
llvm-svn: 288525
This adds assembler support for the instructions provided by the
execution-hint facility (NIAI and BP(R)P). This required adding
support for the new relocation types for 12-bit and 24-bit PC-
relative offsets used by the BP(R)P instructions.
llvm-svn: 288031
This patch adds assembler support for the remaining branch instructions:
the non-relative branch on count variants, and all variants of branch
on index.
The only one of those that can be readily exploited for code generation
is BRCTH (branch on count using a high 32-bit register as count). Do
use it, however, it is necessary to also introduce a hew CHIMux pseudo
to allow comparisons of a 32-bit value agains a short immediate to go
into a high register as well (implemented via CHI/CIH).
This causes a bit of codegen changes overall, but those have proven to
be neutral (or even beneficial) in performance measurements.
llvm-svn: 288029
This patch moves formation of LOC-type instructions from (late)
IfConversion to the early if-conversion pass, and in some cases
additionally creates them directly from select instructions
during DAG instruction selection.
To make early if-conversion work, the patch implements the
canInsertSelect / insertSelect callbacks. It also implements
the commuteInstructionImpl and FoldImmediate callbacks to
enable generation of the full range of LOC instructions.
Finally, the patch adds support for all instructions of the
load-store-on-condition-2 facility, which allows using LOC
instructions also for high registers.
Due to the use of the GRX32 register class to enable high registers,
we now also have to handle the cases where there are still no single
hardware instructions (conditional move from a low register to a high
register or vice versa). These are converted back to a branch sequence
after register allocation. Since the expandRAPseudos callback is not
allowed to create new basic blocks, this requires a simple new pass,
modelled after the ARM/AArch64 ExpandPseudos pass.
Overall, this patch causes significantly more LOC-type instructions
to be used, and results in a measurable performance improvement.
llvm-svn: 288028
This adds support for the compare logical and trap (memory)
instructions that were added as part of the miscellaneous
instruction extensions feature with zEC12.
llvm-svn: 286587
This adds support for the LZRF/LZRG/LLZRGF instructions that were
added on z13, and uses them for code generation were appropriate.
SystemZDAGToDAGISel::tryRISBGZero is updated again to prefer LLZRGF
over RISBG where both would be possible.
llvm-svn: 286586
This adds support for the 31-to-64-bit zero extension instructions
LLGT and LLGTR and uses them for code generation where appropriate.
Since this operation can also be performed via RISBG, we have to
update SystemZDAGToDAGISel::tryRISBGZero so that we prefer LLGT
over RISBG in case both are possible. The patch includes some
simplification to the tryRISBGZero code; this is not intended
to cause any (further) functional change in codegen.
llvm-svn: 286585
This completes assembler / disassembler support for all BFP
instructions provided by the floating-point extensions facility.
The instructions added here are not currently used for codegen.
llvm-svn: 286285
Add several instructions that operate on the program mask
or the addressing mode. These are not really needed for
code generation under Linux, but are provided for completeness
for the assembler/disassembler.
llvm-svn: 286284
Add the 16 access registers as LLVM registers. This allows removing
a lot of special cases in the assembler and disassembler where we
were handling access registers; this can all just use the generic
register code now.
Also add a bunch of instructions to operate on access registers,
for assembler/disassembler use only. No change in code generation
intended.
llvm-svn: 286283
Rework patterns for branches, call & return instructions,
compare-and-branch, compare-and-trap, and conditional move
instructions.
In particular, simplify creation of patterns for the extended
opcodes of instructions that take a CC mask.
Also, use semantical instruction classes for all the instructions
instead of open-coding them in SystemZInstrInfo.td.
Adds a couple of the basic branch instructions (that are unused
for codegen) for the assembler/disassembler.
llvm-svn: 286263
Fixes Bug 30808.
Note that passing subtarget information to predicates seems too complicated, so gfx8-specific def smrd_offset_20 introduced.
Old gfx6/7-specific def renamed to smrd_offset_8 for clarity.
Lit tests updated.
Differential Revision: https://reviews.llvm.org/D26085
llvm-svn: 285590
LLVM currently treats the first operand of MVCK as if it were a
regular base+index+displacement address. However, it is in fact
a base+displacement combined with a length register field.
While the two might look syntactically similar, there are two
semantic differences:
- %r0 is a valid length register, even though it cannot be used
as an index register.
- In an expression with just a single register like 0(%rX), the
register is treated as base with normal addresses, while it is
treated as the length register (with an empty base) for MVCK.
Fixed by adding a new operand parser class BDRAddr and reworking
the assembler parser to distinguish between address + length
register operands and regular addresses.
llvm-svn: 285574
Most z13 vector instructions have a base form where the data type of
the operation (whether to consider the vector to be 16 bytes, 8
halfwords, 4 words, or 2 doublewords) is encoded into a mask field,
and then a set of extended mnemonics where the mask field is not
present but the data type is encoded into the mnemonic name.
Currently, LLVM only supports the type-specific forms (since those
are really the ones needed for code generation), but not the base
type-generic forms.
To complete the assembler support and make it fully compatible with
the GNU assembler, this commit adds assembler aliases for all the
base forms of the various vector instructions.
It also adds two more alias forms that are documented in the PoP:
VFPSO/VFPSODB/WFPSODB -- generic form of VFLCDB etc.
VNOT -- special variant of VNO
llvm-svn: 284586
The vfee[bhf], vfene[bhf], and vistr[bhf] assembler mnemonics are
documented in the Principles of Operation to have an optional last
operand to encode arbitrary values in a mask field.
This commit adds support for those optional operands, and cleans up
the patterns to generate vector string instruction as bit. No change
to code generation intended.
llvm-svn: 284585
For compatiblity with binutils, define these instructions to take
two registers with a 16bit unsigned immediate. Both of the registers
have to be same for dahi and dati.
Reviewers: dsanders, zoran.jovanovic
Differential Review: https://reviews.llvm.org/D21473
llvm-svn: 284218
These instructions were only defined for microMIPSR6 previously. Add
definitions for MIPSR6, correct definitions for microMIPSR6, flag these
instructions as having unmodelled side effects (they disable/enable
virtual processors) and add missing disassember tests for microMIPSR6.
Reviewers: vkalintiris
Differential Review: https://reviews.llvm.org/D24291
llvm-svn: 284115
Add rsqrt.[ds], recip.[ds] for MIPS. Correct the microMIPS definitions for
architecture support and register usage.
Reviewers: vkalintiris, zoran.jovanoic
Differential Review: https://reviews.llvm.org/D24499
llvm-svn: 283334
Add rsqrt.[ds], recip.[ds] for MIPS. Correct the microMIPS definitions for
architecture support and register usage.
Reviewers: vkalintiris, zoran.jovanoic
Differential Review: https://reviews.llvm.org/D24499
llvm-svn: 282485
For compatiblity with binutils, define these instructions to take
two registers with a 16bit unsigned immediate. Both of the registers
have to be same for dahi and dati.
Reviewers: vkalintiris, dsanders, zoran.jovanovic
Differential Review: https://reviews.llvm.org/D21473
llvm-svn: 281724
dcbf has an optional hint-like field, add support for the extended form and the
associated mnemonics (dcbfl and dcbflp).
Partially fixes PR24796.
llvm-svn: 280559
Summary:
Add instruction formats E, RSI, SSd, SSE, and SSF.
Added BRXH, BRXLE, PR, MVCK, STRAG, and ECTG instructions to test out
those formats.
Reviewers: uweigand
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23179
llvm-svn: 277822
Summary:
This is one possible solution to the problem of ignoring constraints that Simon
raised in D21473 but it's a bit of a hack.
The integrated assembler currently ignores violations of the tied register
constraints when the operands involved in a tie are both present in the AsmText.
For example, 'dati $rs, $rt, $imm' with the '$rs = $rt' will silently replace
$rt with $rs. So 'dati $2, $3, 1' is processed as if the user provided
'dati $2, $2, 1' without any diagnostic being emitted.
This is difficult to solve properly because there are multiple parts of the
matcher that are silently forcing these constraints to be met. Tied operands are
rendered to instructions by cloning previously rendered operands but this is
unnecessary because the matcher was already instructed to render the operand it
would have cloned. This is also unnecessary because earlier code has already
replaced the MCParsedOperand with the one it was tied to (so the parsed input
is matched as if it were 'dati <RegIdx 2>, <RegIdx 2>, <Imm 1>'). As a result,
it looks like fixing this properly amounts to a rewrite of the tied operand
handling which affects all targets.
This patch however, merely inserts a checking hook just before the
substitution of MCParsedOperands and the Mips target overrides it. It's not
possible to accurately check the registers are the same this early (because
numeric registers haven't been bound to a register class yet) so it cheats a
bit and checks that the tokens that produced the operand are lexically
identical. This works because tied registers need to have the same register
class but it does have a flaw. It will reject 'dati $4, $a0, 1' for violating
the constraint even though $a0 ends up as the same register as $4.
Reviewers: sdardis
Subscribers: dsanders, llvm-commits, sdardis
Differential Revision: https://reviews.llvm.org/D21994
llvm-svn: 276867
The saturation instructions appeared in v6T2, with DSP extensions, but they
were being accepted / generated on any, with the new introduction of the
saturation detection in the back-end. This commit restricts the usage to
DSP-enable only cores.
Fixes PR28607.
llvm-svn: 276701
Summary: Add support for the z13 instructions LOCHI and LOCGHI which
conditionally load immediate values. Add target instruction info hooks so
that if conversion will allow predication of LHI/LGHI.
Author: RolandF
Reviewers: uweigand
Subscribers: zhanjunl
Commiting on behalf of Roland.
Differential Revision: http://reviews.llvm.org/D22117
llvm-svn: 275086
The way the named arguments for various system instructions are handled at the
moment has a few problems:
- Large-scale duplication between AArch64BaseInfo.h and AArch64BaseInfo.cpp
- That weird Mapping class that I have no idea what I was on when I thought
it was a good idea.
- Searches are performed linearly through the entire list.
- We print absolutely all registers in upper-case, even though some are
canonically mixed case (SPSel for example).
- The ARM ARM specifies sysregs in terms of 5 fields, but those are relegated
to comments in our implementation, with a slightly opaque hex value
indicating the canonical encoding LLVM will use.
This adds a new TableGen backend to produce efficiently searchable tables, and
switches AArch64 over to using that infrastructure.
llvm-svn: 274576
The backend has been around for years, it's pretty ridiculous that we can't
even use the preferred form for printing "MOV" aliases. Unfortunately, TableGen
can't handle the complex predicates when printing so it's a bunch of nasty C++.
Oh well.
llvm-svn: 272865
Support and generate Compare and Traps like CRT, CIT, etc.
Support Trap as legal DAG opcodes and generate "j .+2" for them by default.
Add support for Conditional Traps and use the If Converter to convert them into
the corresponding compare and trap opcodes.
Differential Revision: http://reviews.llvm.org/D21155
llvm-svn: 272419
Another step for unification llvm assembler/disassembler with sp3.
Besides, CodeGen output is a bit improved, thus changes in CodeGen tests.
Assembler/Disassembler tests updated/added.
Differential Revision: http://reviews.llvm.org/D20796
llvm-svn: 271900
new instruction to ARM and AArch64 targets and several system registers.
Patch by: Roger Ferrer Ibanez and Oliver Stannard
Differential Revision: http://reviews.llvm.org/D20282
llvm-svn: 271670
Patch by Nitesh Jain.
Summary: The type of Imm in MipsDisassembler.cpp was incorrect since SignExtend64 return int64_t type.As per the MIPSr6 doc ,the offset is added to the address of the instruction following the branch (not the branch itself), to form a PC-relative effective target address hence “4” is added to the offset. The offset of some test case are update to reflect the changes due to “ + 4 ” offset and new test case for negative offset are added.
Reviewers: dsanders, vkalintiris
Differential Revision: http://reviews.llvm.org/D17540
llvm-svn: 270542
Fixes for MUBUF_Atomic instructions to make operand list valid:
- For RTN insns, make a copy of $vdata_in operand as $vdata.
- Do not add operand for GLC, it is hardcoded and comes as a token.
Workaround to avoid adding multiple default optional operands.
Tests added.
Differential Revision: http://reviews.llvm.org/D20257
llvm-svn: 270049
Summary:
MONITORX/MWAITX instructions provide similar capability to the MONITOR/MWAIT
pair while adding a timer function, such that another termination of the MWAITX
instruction occurs when the timer expires. The presence of the MONITORX and
MWAITX instructions is indicated by CPUID 8000_0001, ECX, bit 29.
The MONITORX and MWAITX instructions are intercepted by the same bits that
intercept MONITOR and MWAIT. MONITORX instruction establishes a range to be
monitored. MWAITX instruction causes the processor to stop instruction execution
and enter an implementation-dependent optimized state until occurrence of a
class of events.
Opcode of MONITORX instruction is "0F 01 FA". Opcode of MWAITX instruction is
"0F 01 FB". These opcode information is used in adding tests for the
disassembler.
These instructions are enabled for AMD's bdver4 architecture.
Patch by Ganesh Gopalasubramanian!
Reviewers: echristo, craig.topper, RKSimon
Subscribers: RKSimon, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19795
llvm-svn: 269911
Summary: On Linux, /usr/include/bits/byteswap-16.h defines __byteswap_16(x) as an inlined LRVH (Load Reversed Half-word) instruction. The SystemZ back-end did not support this opcode and the inlined assembly would cause a fatal error.
Reviewers: bryanpkc, uweigand
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18732
llvm-svn: 269688
Most immediates are printed in Aarch64InstPrinter using 'formatImm' macro,
but not all of them.
Implementation contains following rules:
- floating point immediates are always printed as decimal
- signed integer immediates are printed depends on flag settings
(for negative values 'formatImm' macro prints the value as i.e -0x01
which may be convenient when imm is an address or offset)
- logical immediates are always printed as hex
- the 64-bit immediate for advSIMD, encoded in "a🅱️c:d:e:f:g:h" is always printed as hex
- the 64-bit immedaite in exception generation instructions like:
brk, dcps1, dcps2, dcps3, hlt, hvc, smc, svc is always printed as hex
- the rest of immediates is printed depends on availability
of -print-imm-hex
Signed-off-by: Maciej Gabka <maciej.gabka@arm.com>
Signed-off-by: Paul Osmialowski <pawel.osmialowski@arm.com>
Differential Revision: http://reviews.llvm.org/D16929
llvm-svn: 269446
Added support for sendmsg(MSG[, OP[, STREAM_ID]]) syntax
in s_sendmsg and s_sendmsghalt instructions.
The syntax matches the SP3 assembler/disassembler rules.
That is why implicit inputs (like M0 and EXEC) are not printed
to disassembly output anymore.
sendmsg(...) allows only known message types and attributes,
even if literals are used instead of symbolic names.
However, raw literal (without "sendmsg") still can be used,
and that allows for any 16-bit value.
Tests updated/added.
Differential Revision: http://reviews.llvm.org/D19596
llvm-svn: 268762
Summary:
The goal is for each operand type to have its own parse function and
at the same time share common code for tracking state as different
instruction types share operand types (e.g. glc/glc_flat, etc).
Introduce parseAMDGPUOperand which can parse any optional operand.
DPP and Clamp/OMod have custom handling for now. Sam also suggested
to have class hierarchy for operand types instead of table. This
can be done in separate change.
Remove parseVOP3OptionalOps, parseDS*OptionalOps, parseFlatOptionalOps,
parseMubufOptionalOps, parseDPPOptionalOps.
Reduce number of definitions of AsmOperand's and MatchClasses' by using common base class.
Rename AsmMatcher/InstPrinter methods accordingly.
Print immediate type when printing parsed immediate operand.
Use 'off' if offset/index register is unused instead of skipping it to make it more readable (also agreed with SP3).
Update tests.
Reviewers: tstellarAMD, SamWot, artem.tamazov
Subscribers: qcolombet, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19584
llvm-svn: 268015
Revert "[Power9] Implement add-pc, multiply-add, modulo, extend-sign-shift, random number, set bool, and dfp test significance".
This patch has caused a functional regression in SPEC2k6 namd, and a performance regression in mesa-pipe.
llvm-svn: 267927
Possibility to specify code of hardware register kept.
Disassemble to symbolic name, if name is known.
Tests updated/added.
Differential Revision: http://reviews.llvm.org/D19335
llvm-svn: 267724
Commit r266977 was reason for failing LLVM test suite with error message: fatal error: error in backend: Cannot select: t17: i32 = rotr t2, t11 ...
llvm-svn: 267418
Order should match the sp3 syntax, where destination (simm16 denoting the hwreg) is coming first.
Differential Revision: http://reviews.llvm.org/D19161
llvm-svn: 266617
Alias 'jic $reg, 0' to 'jrc $reg' and 'jialc $reg, 0' to 'jalrc $reg' like
binutils.
This patch was previous committed as r266055 as seemed to have caused some spurious
test failures. They did not reappear after further local testing.
llvm-svn: 266301
This patch corresponds to review:
http://reviews.llvm.org/D17850
This patch implements the following instructions:
cmprb, cmpeqb, cnttzw, cnttzw., cnttzd, cnttzd.
llvm-svn: 266228
Differential Revision: http://reviews.llvm.org/D17137
This patch was reverted after the revertion of dependant patch http://reviews.llvm.org/D17068.
There was the problem with test-suite failure.
The problem is hopefully solved with dependant patch so this patch is commited again.
llvm-svn: 266179
Summary:
Alias 'jic $reg, 0' to 'jrc $reg' and 'jialc $reg, 0' to 'jalrc $reg' like
binutils.
Reviewers: dsanders
Differential Revision: http://reviews.llvm.org/D18856
llvm-svn: 266055
For VGPR_32 operand disassembler expects a VGPR register encoded as 0..255 (enum8 src operand).
readfirstlane/readline actually has enum9 operand and this change fixes VGPR_32 to VS_32 (enum9 encoding).
Differential Revision: http://reviews.llvm.org/D18696
llvm-svn: 265670
This patch implements the following BookII and Book III instructions:
- copy copy_first cp_abort paste paste. paste_last
- msgsync
- slbieg slbsync
- stop
Total 10 instructions
Reviewers: nemanjai hfinkel tjablin amehsan kbarton
llvm-svn: 265504
This adds MC support for fused compare + indirect branch instructions,
ie. CRB, CGRB, CLRB, CLGRB, CIB, CGIB, CLIB, CLGIB. They aren't actually
generated yet -- this is preparation for their use for conditional
returns in the next iteration of D17339.
Author: koriakin
Differential Revision: http://reviews.llvm.org/D18742
llvm-svn: 265296
This patch corresponds to review:
http://reviews.llvm.org/D18032
This patch provides asm implementation for the following instructions:
lwat, ldat, stwat, stdat, ldmx, mcrxrx
llvm-svn: 265022
Autogenerated from the corresponding assembler tests with a few FIXME added (will fix soon).
Differential Revision: http://reviews.llvm.org/D18249
llvm-svn: 263729
This will allow inline assembler code to utilize these features, but no automatic lowering is provided, except for the previously provided @llvm.trap, which lowers to "ta 5".
The change also separates out the different assembly language syntaxes for V8 and V9 Sparc. Previously, only V9 Sparc assembly syntax was provided.
The change also corrects the selection order of trap disassembly, allowing, e.g. "ta %g0 + 15" to be rendered, more readably, as "ta 15", ignoring the %g0 register. This is per the sparc v8 and v9 manuals.
Check-in includes many extra unit tests to check this works correctly on both V8 and V9 Sparc processors.
Code Reviewed at http://reviews.llvm.org/D17960.
llvm-svn: 263044
Summary:
The bug was that dextu's operand 3 would print 0-31 instead of 32-63 when
printing assembly. This came up when replacing
MipsInstPrinter::printUnsignedImm() with a version that could handle arbitrary
bit widths.
MipsAsmPrinter::printUnsignedImm*() don't seem to be used so they have been
removed.
Reviewers: vkalintiris
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D15521
llvm-svn: 262231
These are all co-processor registers, with the exception of the floating-point deferred-trap queue register.
Although these will not be lowered automatically by any instructions, it allows the use of co-processor
instructions implemented by inline-assembly.
Code Reviewed at http://reviews.llvm.org/D17133, with the exception of a very small change in brace placement in SparcInstrInfo.td,
which was formerly causing a problem in the disassembly of the %fq register.
llvm-svn: 262133
Support all instructions with VOP1 encoding with 32 or 64-bit operands for VI subtarget:
VGPR_32 and VReg_64 operand register classes
VS_32 and VS_64 operand register classes with inline and literal constants
Tests for VOP1 instructions.
Patch by: skolton
Reviewers: arsenm, tstellarAMD
Review: http://reviews.llvm.org/D17194
llvm-svn: 261878
Changes:
- Added disassembler project
- Fixed all decoding conflicts in .td files
- Added DecoderMethod=“NONE” option to Target.td that allows to
disable decoder generation for an instruction.
- Created decoding functions for VS_32 and VReg_32 register classes.
- Added stubs for decoding all register classes.
- Added several tests for disassembler
Disassembler only supports:
- VI subtarget
- VOP1 instruction encoding
- 32-bit register operands and inline constants
[Valery]
One of the point that requires to pay attention to is how decoder
conflicts were resolved:
- Groups of target instructions were separated by using different
DecoderNamespace (SICI, VI, CI) using similar to AssemblerPredicate
approach.
- There were conflicts in IMAGE_<> instructions caused by two
different reasons:
1. dmask wasn’t specified for the output (fixed)
2. There are image instructions that differ only by the number of
the address components but have the same encoding by the HW spec. The
actual number of address components is determined by the HW at runtime
using image resource descriptor starting from the VGPR encoded in an
IMAGE instruction. This means that we should choose only one instruction
from conflicting group to be the rule for decoder. I didn’t find the way
to disable decoder generation for an arbitrary instruction and therefore
made a onelinear fix to tablegen generator that would suppress decoder
generation when DecoderMethod is set to “NONE”. This is a change that
should be reviewed and submitted first. Otherwise I would need to
specify different DecoderNamespace for every instruction in the
conflicting group. I haven’t checked yet if DecoderMethod=“NONE” is not
used in other targets.
3. IMAGE_GATHER decoder generation is for now disabled and to be
done later.
[/Valery]
Patch By: Sam Kolton
Differential Revision: http://reviews.llvm.org/D16723
llvm-svn: 261185
Summary:
The bugs were:
* teq and similar take 4-bit unsigned immediates on microMIPS.
* teqi and similar have side-effects like teq do.
* shll_s.w and shra_r.w take 5-bit unsigned immediates.
* The various DSP ext* instructions take a 5-bit immediate.
* repl.qh takes an 8-bit unsigned immediate.
* repl.ph takes a 10-bit unsigned immediate.
* rddsp/wrdsp take a 10-bit unsigned immediate.
* teqi and similar take signed 16-bit immediates (10-bit for microMIPS).
* Out-of-range immediate macros for or/xor take a simm32/simm64 depending
on architecture. I'll fix the simm64 case properly when I reach simm32.
lui is a bit more lenient than GAS and accepts signed immediates in addition
to unsigned. This is because MipsMCExpr can produce signed values when
constant folding and it currently lacks a way of knowing it should fold to
an unsigned value.
Reviewers: vkalintiris
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D15446
llvm-svn: 259360
This patch was originally committed as r257884, but was reverted due to windows
failures. The cause of these failures has been fixed under r258677, hence
re-committing the original patch.
llvm-svn: 258682
This was originally committed as r255762, but reverted as it broke windows
bots. Re-commitiing the exact same patch, as the underlying cause was fixed by
r258677.
ARMv8.2-A adds 16-bit floating point versions of all existing VFP
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.
The assembly for these instructions uses S registers (AArch32 does not
have H registers), but the instructions have ".f16" type specifiers
rather than ".f32" or ".f64". The top 16 bits of each source register
are ignored, and the top 16 bits of the destination register are set to
zero.
These instructions are mostly the same as the 32- and 64-bit versions,
but they use coprocessor 9 rather than 10 and 11.
Two new instructions, VMOVX and VINS, have been added to allow packing
and extracting two 16-bit floats stored in the top and bottom halves of
an S register.
New fixup kinds have been added for the PC-relative load and store
instructions, but no ELF relocations have been added as they have a
range of 512 bytes.
Differential Revision: http://reviews.llvm.org/D15038
llvm-svn: 258678
# The first commit's message is:
Revert "[ARM] Add DSP build attribute and extension targeting"
This reverts commit b11cc50c0b4a7c8cdb628abc50b7dc226ff583dc.
# This is the 2nd commit message:
Revert "[ARM] Add new system registers to ARMv8-M Baseline/Mainline"
This reverts commit 837d08454e3e5beb8581951ac26b22fa07df3cd5.
llvm-svn: 257916
Summary:
It actually takes an offset into the current PC-region.
This fixes the 'expr' command in lldb.
Reviewers: vkalintiris, jaydeep, bhushan
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D16054
llvm-svn: 257339
Summary:
There are a number of files in the tree which have been accidentally checked in with DOS line endings. Convert these to native line endings.
There are also a few files which have DOS line endings on purpose, and I have set the svn:eol-style property to 'CRLF' on those.
Reviewers: joerg, aaron.ballman
Subscribers: aaron.ballman, sanjoy, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D15848
llvm-svn: 256707
ARMv8.2-A adds 16-bit floating point versions of all existing SIMD
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.
Note that VFP without SIMD is not a valid combination for any version of
ARMv8-A, but I have ensured that these instructions all depend on both
FeatureNEON and FeatureFullFP16 for consistency.
Differential Revision: http://reviews.llvm.org/D15039
llvm-svn: 255764
ARMv8.2-A adds 16-bit floating point versions of all existing VFP
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.
The assembly for these instructions uses S registers (AArch32 does not
have H registers), but the instructions have ".f16" type specifiers
rather than ".f32" or ".f64". The top 16 bits of each source register
are ignored, and the top 16 bits of the destination register are set to
zero.
These instructions are mostly the same as the 32- and 64-bit versions,
but they use coprocessor 9 rather than 10 and 11.
Two new instructions, VMOVX and VINS, have been added to allow packing
and extracting two 16-bit floats stored in the top and bottom halves of
an S register.
New fixup kinds have been added for the PC-relative load and store
instructions, but no ELF relocations have been added as they have a
range of 512 bytes.
Differential Revision: http://reviews.llvm.org/D15038
llvm-svn: 255762
Commited patch was intended to implement LH, LHE, LHU and LHUE instructions.
After commit test-suite failed with error message in the form of:
fatal error: error in backend: Cannot select: t124: i32,ch = load<LD2[%d](tbaa=<0x94acc48>), sext from i16> t0, t2, undef:i32
For that reason I decided to revert commit r254897 and make new patch which besides implementation and standard regression tests will also have dedicated tests (CodeGen) for the above error.
llvm-svn: 255109
ARMv8.2-A adds 16-bit floating point versions of all existing SIMD
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.
Note that VFP without SIMD is not a valid combination for any version of
ARMv8-A, but I have ensured that these instructions all depend on both
FeatureNEON and FeatureFullFP16 for consistency.
The ".2h" vector type specifier is now legal (for the scalar pairwise
reduction instructions), so some unrelated tests have been modified as
different error messages are emitted. This is not a problem as the
invalid operands are still caught.
llvm-svn: 255010
The Statistical Profiling Extension is an optional extension to
ARMv8.2-A. Since it is an optional extension, I have added the
FeatureSPE subtarget feature to control it. The assembler-visible parts
of this extension are the new "psb csync" instruction, which is
equivalent to "hint #17", and a number of system registers.
Differential Revision: http://reviews.llvm.org/D15021
llvm-svn: 254401
Value of offset operand for microMIPS BALC and BC instructions is currently shifted 2 bits, but it should be 1 bit.
Differential Revision: http://reviews.llvm.org/D14770
llvm-svn: 254296
ARMv8.2-A adds 16-bit floating point versions of all existing VFP
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.
Most of these instructions are the same as the 32- and 64-bit versions,
but with the type field (bits 23-22) set to 0b11. Previously the top bit
of the size field was always 0, so the instruction classes only provided
a 1-bit size field, which I have widened to 2 bits.
Differential Revision: http://reviews.llvm.org/D15014
llvm-svn: 254198
ARMv8.2-A adds new variants of the "at" (address translate) system
instruction, which take the PSTATE.PAN bit (added in ARMv8.1-A). These
are a required part of ARMv8.2-A, so no additional subtarget features
are required.
Differential Revision: http://reviews.llvm.org/D15018
llvm-svn: 254159
ARMv8.2-A adds a new PSTATE bit, PSTATE.UAO, which allows the LDTR/STTR
instructions to behave the same as LDR/STR with respect to execute-only
pages at higher privilege levels. New variants of the MSR/MRS
instructions are added to allow reading and writing this bit. It is a
required part of ARMv8.2-A, so no additional subtarget features are
required.
Differential Revision: http://reviews.llvm.org/D15020
llvm-svn: 254157
ARMv8.2-A adds the "dc cvap" instruction, which is a system instruction
that cleans caches to the point of persistence (for systems that have
persistent memory). It is a required part of ARMv8.2-A, so no additional
subtarget features are required.
Differential Revision: http://reviews.llvm.org/D15016
llvm-svn: 254156
ARMv8.2-A adds a new ID register, ID_A64MMFR2_EL1, which behaves in the
same way as ID_A64MMFR0_EL1 and ID_A64MMFR1_EL1. It is a required part
of ARMv8.2-A, so no additional subtarget features are required.
Differential Revision: http://reviews.llvm.org/D15017
llvm-svn: 254155
Summary:
The bug was that the MIPS32R6/MIPS64R6/microMIPS32R6 versions of LSA and DLSA
(unlike the MSA version) failed to account for the off-by-one encoding of the
immediate. The range is actually 1..4 rather than 0..3.
Reviewers: vkalintiris
Subscribers: atanasyan, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D14015
llvm-svn: 252295
Summary:
This patch handles assembly and disassembly, but not codegen, as of yet.
Additionally, it fixes a bug whereby SP and PC as shifted-reg operands
were treated as predictable in ARMv7 Thumb; and it enables the tests
for invalid and unpredictable instructions to run on both ARMv7 and ARMv8.
Reviewers: jmolloy, rengolin
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: http://reviews.llvm.org/D14141
llvm-svn: 251516
cntlz is the old POWER mnemonic. cntlzw is the PowerPC mnemonic.
This change fixes an issue when -no-integrated-as: The opcode cntlz is
unrecognized by gas
Alias the POWER mnemonic cntlz[.] to the PowerPC mnemonic cntlzw[.]
This is done for because the POWER cntlz mnemonic has be used by LLVM for
a very long time. We need to make sure that assembly programs
that are using the cntlz[.] do not break with this change.
Change PowerPC tests to reflect the insn change from cntlz to cntlzw.
Add assembly test to verify cntlz[.] is encoded correctly.
Patch by Tom Rix!
llvm-svn: 251489
Summary:
The forwards compatibility strategy employed by MIPS is to consider registers
to be infinitely sign-extended. Then on ISA's with a wider register, the result
of existing instructions are sign-extended to register width and zero-extended
counterparts are added. copy_u.w on MSA32 and copy_u.w on MSA64 violate this
strategy and we have therefore corrected the MSA specs to fix this.
We still keep track of sign/zero-extension during legalization but we now
match copy_s.[wd] where required.
No change required to clang since __builtin_msa_copy_u_[wd] will map to
copy_s.[wd] where appropriate for the target.
Reviewers: vkalintiris
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13472
llvm-svn: 250887