In order to properly implement these atomic we need one register more than other
binary atomics. It is used for storing result from comparing values in addition
to the one that is used for actual result of operation.
https://reviews.llvm.org/D71028
`saa` and `saad` are 32-bit and 64-bit store atomic add instructions.
memory[base] = memory[base] + rt
These instructions are available for "Octeon+" CPU. The patch adds support
for both instructions to MIPS assembler and diassembler and introduces new
CPU type - "octeon+".
Next patches will implement `.set arch=octeon+` directive and `AFL_EXT_OCTEONP`
ISA extension flag support.
Differential Revision: https://reviews.llvm.org/D69849
This reverts r372314, reapplying r372285 and the commits which depend
on it (r372286-r372293, and r372296-r372297)
This was missing one switch to getTargetConstant in an untested case.
llvm-svn: 372338
This broke the Chromium build, causing it to fail with e.g.
fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15>
See llvm-commits thread of r372285 for details.
This also reverts r372286, r372287, r372288, r372289, r372290, r372291,
r372292, r372293, r372296, and r372297, which seemed to depend on the
main commit.
> Encode them directly as an imm argument to G_INTRINSIC*.
>
> Since now intrinsics can now define what parameters are required to be
> immediates, avoid using registers for them. Intrinsics could
> potentially want a constant that isn't a legal register type. Also,
> since G_CONSTANT is subject to CSE and legalization, transforms could
> potentially obscure the value (and create extra work for the
> selector). The register bank of a G_CONSTANT is also meaningful, so
> this could throw off future folding and legalization logic for AMDGPU.
>
> This will be much more convenient to work with than needing to call
> getConstantVRegVal and checking if it may have failed for every
> constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
> immarg operands, many of which need inspection during lowering. Having
> to find the value in a register is going to add a lot of boilerplate
> and waste compile time.
>
> SelectionDAG has always provided TargetConstant for constants which
> should not be legalized or materialized in a register. The distinction
> between Constant and TargetConstant was somewhat fuzzy, and there was
> no automatic way to force usage of TargetConstant for certain
> intrinsic parameters. They were both ultimately ConstantSDNode, and it
> was inconsistently used. It was quite easy to mis-select an
> instruction requiring an immediate. For SelectionDAG, start emitting
> TargetConstant for these arguments, and using timm to match them.
>
> Most of the work here is to cleanup target handling of constants. Some
> targets process intrinsics through intermediate custom nodes, which
> need to preserve TargetConstant usage to match the intrinsic
> expectation. Pattern inputs now need to distinguish whether a constant
> is merely compatible with an operand or whether it is mandatory.
>
> The GlobalISelEmitter needs to treat timm as a special case of a leaf
> node, simlar to MachineBasicBlock operands. This should also enable
> handling of patterns for some G_* instructions with immediates, like
> G_FENCE or G_EXTRACT.
>
> This does include a workaround for a crash in GlobalISelEmitter when
> ARM tries to uses "imm" in an output with a "timm" pattern source.
llvm-svn: 372314
Encode them directly as an imm argument to G_INTRINSIC*.
Since now intrinsics can now define what parameters are required to be
immediates, avoid using registers for them. Intrinsics could
potentially want a constant that isn't a legal register type. Also,
since G_CONSTANT is subject to CSE and legalization, transforms could
potentially obscure the value (and create extra work for the
selector). The register bank of a G_CONSTANT is also meaningful, so
this could throw off future folding and legalization logic for AMDGPU.
This will be much more convenient to work with than needing to call
getConstantVRegVal and checking if it may have failed for every
constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
immarg operands, many of which need inspection during lowering. Having
to find the value in a register is going to add a lot of boilerplate
and waste compile time.
SelectionDAG has always provided TargetConstant for constants which
should not be legalized or materialized in a register. The distinction
between Constant and TargetConstant was somewhat fuzzy, and there was
no automatic way to force usage of TargetConstant for certain
intrinsic parameters. They were both ultimately ConstantSDNode, and it
was inconsistently used. It was quite easy to mis-select an
instruction requiring an immediate. For SelectionDAG, start emitting
TargetConstant for these arguments, and using timm to match them.
Most of the work here is to cleanup target handling of constants. Some
targets process intrinsics through intermediate custom nodes, which
need to preserve TargetConstant usage to match the intrinsic
expectation. Pattern inputs now need to distinguish whether a constant
is merely compatible with an operand or whether it is mandatory.
The GlobalISelEmitter needs to treat timm as a special case of a leaf
node, simlar to MachineBasicBlock operands. This should also enable
handling of patterns for some G_* instructions with immediates, like
G_FENCE or G_EXTRACT.
This does include a workaround for a crash in GlobalISelEmitter when
ARM tries to uses "imm" in an output with a "timm" pattern source.
llvm-svn: 372285
If result of 64-bit address loading combines with 32-bit mask, LLVM
tries to optimize the code and remove "redundant" loading of upper
32-bits of the address. It leads to incorrect code on MIPS64 targets.
MIPS backend creates the following chain of commands to load 64-bit
address in the `MipsTargetLowering::getAddrNonPICSym64` method:
```
(add (shl (add (shl (add %highest(sym), %higher(sym)),
16),
%hi(sym)),
16),
%lo(%sym))
```
If the mask presents, LLVM decides to optimize the chain of commands. It
really does not make sense to load upper 32-bits because the 0x0fffffff
mask anyway clears them. After removing redundant commands we get this
chain:
```
(add (shl (%hi(sym), 16), %lo(%sym))
```
There is no patterns matched `(MipsHi (i64 symbol))`. Due a bug in `SYM_32`
predicate definition, backend incorrectly selects a pattern for a 32-bit
symbols and uses the `lui` instruction for loading `%hi(sym)`.
As a result we get incorrect set of instructions with unnecessary 16-bit
left shifting:
```
lui at,0x0
R_MIPS_HI16 foo
dsll at,at,0x10
daddiu at,at,0
R_MIPS_LO16 foo
```
This patch resolves two problems:
- Fix `SYM_32/SYM_64` predicates to prevent selection of patterns dedicated
to 32-bit symbols in case of using N64 ABI.
- Add missed patterns for 64-bit symbols for `%hi/%lo`.
Fix PR42736.
Differential Revision: https://reviews.llvm.org/D66228
llvm-svn: 370268
The `sge/sgeu Dst, Src1, Src2/Imm` pseudo instructions set register
`Dst` to 1 if register `Src1` is greater than or equal `Src2/Imm` and
to 0 otherwise.
Differential Revision: https://reviews.llvm.org/D64314
llvm-svn: 365476
The `sgt/sgtu Dst, Src1, Src2/Imm` pseudo instructions set register
`Dst` to 1 if register `Src1` is greater than `Src2/Imm` and to 0 otherwise.
Differential Revision: https://reviews.llvm.org/D64313
llvm-svn: 365475
Add `IsGP64bit` and `IsPTR64bit` to the list of `UnsupportedFeatures`
of the P5600 scheduling definitions. Also mark some MIPS 64-bit
instructions by PTR_64 and GPR_64 predicates. This reduces number
of "No schedule information for" and "lacks information for" errors
in case of marking this scheduler model as complete.
This patch is one of a series of patches. The goal is to make P5600
scheduler model complete and turn on the `CompleteModel` flag.
Differential Revision: https://reviews.llvm.org/D63237
llvm-svn: 363702
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
Expand on LONG_BRANCH_LUi and LONG_BRANCH_(D)ADDiu pseudo
instructions by creating variants which support
less operands/accept GPR64Opnds as their operand in order
to appease the machine verifier pass.
Differential Revision: https://reviews.llvm.org/D53977
llvm-svn: 346133
MIPS ISAs start to support third operand for the `rdhwr` instruction
starting from Revision 6. But LLVM generates assembler code with
three-operands version of this instruction on any MIPS64 ISA. The third
operand is always zero, so in case of direct code generation we get
correct code.
This patch fixes the bug by adding an instruction alias. The same alias
already exists for 32-bit ISA.
Ideally, we also need to reject three-operands version of the `rdhwr`
instruction in an assembler code if ISA revision is less than 6. That is
a task for a separate patch.
This fixes PR38861 (https://bugs.llvm.org/show_bug.cgi?id=38861)
Differential revision: https://reviews.llvm.org/D51773
llvm-svn: 341919
Override getTypeForExtReturn so that functions returning
an i32 typed value have it sign extended on MIPS64.
Also provide patterns to get rid of unneeded sign extensions
for arithmetic instructions which implicitly sign extend
their results.
Differential Revision: https://reviews.llvm.org/D48374
llvm-svn: 338019
For the final DTPREL addition, rather than a lui/daddiu/daddu triple,
LLVM was erronously emitting a daddiu/daddiu pair, treating the %dtprel_hi
as if it were a %dtprel_lo, since Mips::Hi expands unshifted for Sym64.
Instead, use a new TlsHi node and, although unnecessary due to the exact
structure of the nodes emitted, use TlsHi for local exec too to prevent
future bugs. Also garbage-collect the unused TprelLo and TlsGd nodes,
and TprelHi since its functionality is provided by the new common TlsHi node.
Patch by James Clarke.
Differential revision: https://reviews.llvm.org/D49259
llvm-svn: 337827
Instead, the pattern is tagged with the correct predicate when
it is declared. Some patterns have been duplicated as necessary.
Patch by Simon Dardis.
Differential revision: https://reviews.llvm.org/D48365
llvm-svn: 337171
Similar to PR/25526, fast-regalloc introduces spills at the end of basic
blocks. When this occurs in between an ll and sc, the stores can cause the
atomic sequence to fail.
This patch fixes the issue by introducing more pseudos to represent atomic
operations and moving their lowering to after the expansion of postRA
pseudos.
This version addresses issues with the initial implementation and covers
all atomic operations.
This resolves PR/32020.
Thanks to James Cowgill for reporting the issue!
Patch By: Simon Dardis
Differential Revision: https://reviews.llvm.org/D31287
llvm-svn: 336328
This property is needed in order to follow values movement between
registers. This property is used in TII to implement method that
returns true if simple copy like instruction is recognized, along
with source and destination machine operands.
Patch by Nikola Prica.
Differential Revision: https://reviews.llvm.org/D45204
llvm-svn: 333093
This is a follow up to the rL330983. The patch teaches ld, sd, and lld
commands accept 32-bit memory offsets by replacing `mem_simm16` operand
to `mem_simmptr`. In fact, these commands should accept 64-bit offsets,
but so large offsets require another command expanding and will be
supported by a separate patch.
Differential Revision: https://reviews.llvm.org/D46629
llvm-svn: 331997
Guard the MIPS64 variant correctly for i64, mark the MIPS32 version as not
in microMIPS and provide the microMIPS version.
Additionally, remove a related stale XFAIL'd test as bswap has its own test
case providing coverage.
Reviewers: smaksimovic, abeserminji, atanasyan
Differential Revision: https://reviews.llvm.org/D45816
llvm-svn: 330705
This patch provides mitigation for CVE-2017-5715, Spectre variant two,
which affects the P5600 and P6600. It implements the LLVM part of
-mindirect-jump=hazard. It is _not_ enabled by default for the P5600.
The migitation strategy suggested by MIPS for these processors is to use
hazard barrier instructions. 'jalr.hb' and 'jr.hb' are hazard
barrier variants of the 'jalr' and 'jr' instructions respectively.
These instructions impede the execution of instruction stream until
architecturally defined hazards (changes to the instruction stream,
privileged registers which may affect execution) are cleared. These
instructions in MIPS' designs are not speculated past.
These instructions are used with the attribute +use-indirect-jump-hazard
when branching indirectly and for indirect function calls.
These instructions are defined by the MIPS32R2 ISA, so this mitigation
method is not compatible with processors which implement an earlier
revision of the MIPS ISA.
Performance benchmarking of this option with -fpic and lld using
-z hazardplt shows a difference of overall 10%~ time increase
for the LLVM testsuite. Certain benchmarks such as methcall show a
substantially larger increase in time due to their nature.
Reviewers: atanasyan, zoran.jovanovic
Differential Revision: https://reviews.llvm.org/D43486
llvm-svn: 325653
All files and parts of files related to microMIPS4R6 are removed.
When target is microMIPS4R6, errors are printed.
This is LLVM part of patch.
Differential Revision: https://reviews.llvm.org/D35625
llvm-svn: 320350
Change the ISel matching of 'ins', 'dins[mu]' from tablegen code to
C++ code. This resolves an issue where ISel would select 'dins' instead
of 'dinsm' when the instructions size and position were individually in
range but their sum was out of range according to the ISA specification.
Reviewers: atanasyan
Differential Revision: https://reviews.llvm.org/D39117
llvm-svn: 317331
The other members of the dext family of instructions (dextm, dextu) are
traditionally handled by the assembler selecting the right variant of
'dext' depending on the values of the position and size operands.
When these instructions are disassembled, rather than reporting the
actual instruction, an equivalent aliased form of 'dext' is generated
and is reported. This is to mimic the behaviour of binutils.
Reviewers: slthakur, nitesh.jain, atanasyan
Differential Revision: https://reviews.llvm.org/D34887
llvm-svn: 313276
Traditionally GAS has provided automatic selection between dins, dinsm and
dinsu. Binutils also disassembles all instructions in that family as 'dins'
rather than the actual instruction.
Reviewers: slthakur
Differential Revision: https://reviews.llvm.org/D34877
llvm-svn: 313267
This patch complements D16810 "[mips] Make isel select the correct DEXT variant
up front.". Now ISel picks the right variant of DINS, so now there is no need
to replace DINS with the appropriate variant during
MipsMCCodeEmitter::encodeInstruction().
This patch also enables target specific instruction verification for ins, dins,
dinsm, dinsu, ext, dext, dextm, dextu. These instructions have constraints that
are checked when generating MipsISD::Ins and MipsISD::Ext nodes, but these
constraints are not checked during instruction selection. Adding machine
verification should catch outstanding cases.
Finally, correct a bug that instruction verification uncovered, where the
position operand of a DINSU generated during lowering was being silently
and accidently corrected to the correct value.
Reviewers: slthakur
Differential Revision: https://reviews.llvm.org/D34809
llvm-svn: 313254
This patch corrects the definition of the DINSM instruction.
Specification for DINSM instruction for Mips64 says that size operand should
be 2 <= size <= 64, but it is defined as uimm5_inssize_plus1 which gives
range of 1 .. 32.
Patch by Aleksandar Beserminji.
Differential Revision: https://reviews.llvm.org/D37683
llvm-svn: 313149
This patch supports one more pattern for bbit0 and bbit1
instructions, CBranchBitNum class is expanded so it can
take 32 bit immidate.
Differential Revision: https://reviews.llvm.org/D36222
llvm-svn: 312111
Post commit review of rL308619 highlighted the need for handling N64
with -fno-pic. Testing reveale a stale assert when generating a GP
relative addressing mode.
This patch removes that assert and adds the necessary patterns for
MIPS64 to perform gp relative addressing with -fno-pic
(and the implicit -mno-abicalls + -mgpopt).
Reviewers: atanasyan, nitesh.jain
Differential Revision: https://reviews.llvm.org/D36472
llvm-svn: 310713
Add the instruction aliases for ds(r|l)l for the two operand alias
of ds(r|l)lv and the aliases ds(r|l)l with the three register operands.
llvm-svn: 306405
This patch adds support for recognizing more patterns to match to DEXT and
CINS instructions.
It finds cases where multiple instructions could be replaced with a single
DEXT or CINS instruction.
For example, for the following:
define i64 @dext_and32(i64 zeroext %a) {
entry:
%and = and i64 %a, 4294967295
ret i64 %and
}
instead of generating:
0000000000000088 <dext_and32>:
88: 64010001 daddiu at,zero,1
8c: 0001083c dsll32 at,at,0x0
90: 6421ffff daddiu at,at,-1
94: 03e00008 jr ra
98: 00811024 and v0,a0,at
9c: 00000000 nop
the following gets generated:
0000000000000068 <dext_and32>:
68: 03e00008 jr ra
6c: 7c82f803 dext v0,a0,0x0,0x20
Cases that are covered:
DEXT:
1. and $src, mask where mask > 0xffff
2. zext $src zero extend from i32 to i64
CINS:
1. and (shl $src, pos), mask
2. shl (and $src, mask), pos
3. zext (shl $src, pos) zero extend from i32 to i64
Patch by Violeta Vukobrat.
Differential Revision: https://reviews.llvm.org/D30464
llvm-svn: 297832
The fix introduces segfaults and clobbers the value to be stored when
the atomic sequence loops.
Revert "[Target/MIPS] Kill dead code, no functional change intended."
This reverts commit r296153.
Revert "Recommit "[mips] Fix atomic compare and swap at O0.""
This reverts commit r296134.
llvm-svn: 297380
This time with the missing files.
Similar to PR/25526, fast-regalloc introduces spills at the end of basic
blocks. When this occurs in between an ll and sc, the store can cause the
atomic sequence to fail.
This patch fixes the issue by introducing more pseudos to represent atomic
operations and moving their lowering to after the expansion of postRA
pseudos.
This resolves PR/32020.
Thanks to James Cowgill for reporting the issue!
Reviewers: slthakur
Differential Revision: https://reviews.llvm.org/D30257
llvm-svn: 296134
Similar to PR/25526, fast-regalloc introduces spills at the end of basic
blocks. When this occurs in between an ll and sc, the store can cause the
atomic sequence to fail.
This patch fixes the issue by introducing more pseudos to represent atomic
operations and moving their lowering to after the expansion of postRA
pseudos.
This resolves PR/32020.
Thanks to James Cowgill for reporting the issue!
Reviewers: slthakur
Differential Revision: https://reviews.llvm.org/D30257
llvm-svn: 296132
Previously LLVM was assuming 32-bit signed immediates which results in and with
a bitmask that has bit 31 set to incorrectly include bits 63-32 in the result.
After applying this patch I can now compile all of the FreeBSD mips assembly
code with clang.
This issue also affects the nor, slt and sltu macros and I will fix those in a
separate review.
Patch By: Alexander Richardson
Commit message reformatted by sdardis.
Reviewers: atanasyan, theraven, sdardis
Differential Revision: https://reviews.llvm.org/D30298
llvm-svn: 296125
Clean up the implementation of divide macro expansion by getting rid of a
FIXME regarding magic numbers and branch instructions. Match GAS' behaviour
for expansion of ddiv / div in the two and three operand cases. Add the two
operand alias for MIPSR6. Finally, optimize macro expansion cases where the
divisior is the $zero register.
Reviewers: slthakur
Differential Revision: https://reviews.llvm.org/D29887
llvm-svn: 294960