Consider large operands in G_MERGE_VALUES and G_UNMERGE_VALUES as
Ambiguous during regbank selection.
Introducing new InstType AmbiguousWithMergeOrUnmerge which will
allow us to recognize whether to narrow scalar or use s64:fprb.
This change exposed a bug when reusing data from TypeInfoForMF.
Thus when Instr is about to get destroyed (using narrow scalar)
clear its data in TypeInfoForMF. Internal data is saved based on
Instr's address, and it will no longer be valid.
Add detailed asserts for InstType and operand size.
Generate generic instructions instead of MIPS target instructions
during argument lowering and custom legalizer.
Select G_UNMERGE_VALUES and G_MERGE_VALUES when proper banks are
selected: {s32:gprb, s32:gprb, s64:fprb} for G_UNMERGE_VALUES and
{s64:fprb, s32:gprb, s32:gprb} for G_MERGE_VALUES.
Update tests. One improvement is when floating point argument in
gpr(or two gprs) gets passed to another function through gpr
unnecessary fpr-to-gpr moves are no longer generated.
Differential Revision: https://reviews.llvm.org/D74623
Like COPY instructions explained in D70616, we don't check the constraints
when combining G_UNMERGE_VALUES. Use the same logic used in D70616 to check
if registers can be replaced, or a COPY instruction needs to be built.
https://reviews.llvm.org/D70564
Summary:
Add a new method (tryParseRegister) that attempts to parse a register specification.
MASM allows the use of IFDEF <register>, as well as IFDEF <symbol>. To accommodate this, we make it possible to check whether a register specification can be parsed at the current location, without failing the entire parse if it can't.
Reviewers: thakis
Reviewed By: thakis
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73486
New intrinisics are implemented for when we need to port SIMD code from other
arhitectures and only load or store portions of MSA registers.
Following intriniscs are added which only load/store element 0 of a vector:
v4i32 __builtin_msa_ldrq_w (const void *, imm_n2048_2044);
v2i64 __builtin_msa_ldr_d (const void *, imm_n4096_4088);
void __builtin_msa_strq_w (v4i32, void *, imm_n2048_2044);
void __builtin_msa_str_d (v2i64, void *, imm_n4096_4088);
Differential Revision: https://reviews.llvm.org/D73644
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: arsenm, dschuff, jyknight, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73885
Summary: This is a first step before changing the types to llvm::Align and introduce functions to ease client code.
Reviewers: courbet
Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73785
This is passed to legalizeCustom, but not intrinsic. Also remove the
MRI argument, since you can get that from the MachineIRBuilder.
I'm not sure why MachineIRBuilder has a private observer member, and
this is passed separately.
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
G_CTPOP is generated from llvm.ctpop.<type> intrinsics, clang generates
these intrinsics from __builtin_popcount and __builtin_popcountll.
Add lower and narrow scalar for G_CTPOP.
Lower G_CTPOP for MIPS32.
Differential Revision: https://reviews.llvm.org/D73216
llvm.cttz.<type> intrinsic has additional i1 argument is_zero_undef,
it tells whether zero as the first argument produces a defined result.
G_CTTZ is generated from llvm.cttz.<type> (<type> <src>, i1 false)
intrinsics, clang generates these intrinsics from __builtin_ctz and
__builtin_ctzll.
G_CTTZ_ZERO_UNDEF comes from llvm.cttz.<type> (<type> <src>, i1 true).
Clang generates such intrinsics as parts of expansion of builtin_ffs
and builtin_ffsll. It is also traditionally part of and many
algorithms that are now predicated on avoiding zero-value inputs.
Add narrow scalar (algorithm uses G_CTTZ_ZERO_UNDEF) for G_CTTZ.
Lower G_CTTZ and G_CTTZ_ZERO_UNDEF for MIPS32.
Differential Revision: https://reviews.llvm.org/D73215
llvm.ctlz.<type> intrinsic has additional i1 argument is_zero_undef,
it tells whether zero as the first argument produces a defined result.
MIPS clz instruction returns 32 for zero input.
G_CTLZ is generated from llvm.ctlz.<type> (<type> <src>, i1 false)
intrinsics, clang generates these intrinsics from __builtin_clz and
__builtin_clzll.
G_CTLZ_ZERO_UNDEF can also be generated from llvm.ctlz with true as
second argument. It is also traditionally part of and many algorithms
that are now predicated on avoiding zero-value inputs.
Add narrow scalar for G_CTLZ (algorithm uses G_CTLZ_ZERO_UNDEF).
Lower G_CTLZ_ZERO_UNDEF and select G_CTLZ for MIPS32.
Differential Revision: https://reviews.llvm.org/D73214
Updated FoldConstantArithmetic method signature to match that of
FoldConstantVectorArithmetic in preparation for merging the two
functions together
https://bugs.llvm.org/show_bug.cgi?id=36544
This is the first step in combining the various
FoldConstantVectorArithmetic and FoldConstantVectorArithmetic
functions into one FoldConstantArithmetic function.
Differential Revision: https://reviews.llvm.org/D72870
Summary:
This is a follow up on https://reviews.llvm.org/D71473#inline-647262.
There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case.
Reviewers: xbolva00, courbet, bollu
Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D73099
Except AMDGPU/R600RegisterInfo (a bunch of MIR tests seem to have
problems), every target overrides it with true. PostMachineScheduler
requires livein information. Not providing it can cause assertion
failures in ScheduleDAGInstrs::addSchedBarrierDeps().
Summary:
For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF
this change makes all symbols in the target specific libraries hidden
by default.
A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these
libraries public, which is mainly needed for the definitions of the
LLVMInitialize* functions.
This patch reduces the number of public symbols in libLLVM.so by about
25%. This should improve load times for the dynamic library and also
make abi checker tools, like abidiff require less memory when analyzing
libLLVM.so
One side-effect of this change is that for builds with
LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that
access symbols that are no longer public will need to be statically linked.
Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1):
nm before/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
36221
nm after/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l
26278
Reviewers: chandlerc, beanz, mgorny, rnk, hans
Reviewed By: rnk, hans
Subscribers: merge_guards_bot, luismarques, smeenai, ldionne, lenary, s.egerton, pzheng, sameer.abuasal, MaskRay, wuzish, echristo, Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D54439
The R_(MICRO)MIPS_JALR optimization only works when used against functions.
Using the relocation against a data symbol (e.g. function pointer) will
cause some linkers that don't ignore the hint in this case (e.g. LLD prior
to commit 5bab291b7b) to generate a relative branch to the data symbol
which crashes at run time. Before this patch, LLVM was erroneously emitting
these relocations against local-dynamic TLS function pointers and global
function pointers with internal visibility.
Reviewers: atanasyan, jrtc27, vstefanovic
Reviewed By: atanasyan
Differential Revision: https://reviews.llvm.org/D72571
The argument is llvm::null() everywhere except llvm::errs() in
llvm-objdump in -DLLVM_ENABLE_ASSERTIONS=On builds. It is used by no
target but X86 in -DLLVM_ENABLE_ASSERTIONS=On builds.
If we ever have the needs to add verbose log to disassemblers, we can
record log with a member function, instead of passing it around as an
argument.
Summary:
This always just used the same libcall as unordered, but the comparison predicate was different. This change appears to have been made when targets were given the ability to override the predicates. Before that they were hardcoded into the type legalizer. At that time we never inverted predicates and we handled ugt/ult/uge/ule compares by emitting an unordered check ORed with a ogt/olt/oge/ole checks. So only ordered needed an inverted predicate. Later ugt/ult/uge/ule were optimized to only call a single libcall and invert the compare.
This patch removes the ordered entries and just uses the inverting logic that is now present. This removes some odd things in both the Mips and WebAssembly code.
Reviewers: efriedma, ABataev, uweigand, cameron.mcinally, kpn
Reviewed By: efriedma
Subscribers: dschuff, sdardis, sbc100, arichardson, jgravelle-google, kristof.beyls, hiraditya, aheejin, sunfish, atanasyan, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72536
Only PPC seems to be using it, and only checks some simple cases and
doesn't distinguish between FP. Just switch to using LLT to simplify
use from GlobalISel.
Summary:
In commit b91f239485 I updated the
MipsDelaySlotFiller to skip BUNDLE instructions.
However, in addition to not considering BUNDLE instructions for the delay
slot, we also need to ensure that the register def-use information is
updated. Not updating this information caused run-time crashes (when using
the out-of-tree CHERI backend) since later definitions could be overwritten
with earlier register values.
Reviewers: atanasyan
Reviewed By: atanasyan
Differential Revision: https://reviews.llvm.org/D72254
printInst prints a branch/call instruction as `b offset` (there are many
variants on various targets) instead of `b address`.
It is a convention to use address instead of offset in most external
symbolizers/disassemblers. This difference makes `llvm-objdump -d`
output unsatisfactory.
Add `uint64_t Address` to printInst(), so that it can pass the argument to
printInstruction(). `raw_ostream &OS` is moved to the last to be
consistent with other print* methods.
The next step is to pass `Address` to printInstruction() (generated by
tablegen from the instruction set description). We can gradually migrate
targets to print addresses instead of offsets.
In any case, downstream projects which don't know `Address` can pass 0 as
the argument.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D72172
AMDGPU can't unambiguously go back from the selected instruction
register class to the register bank without knowing if this was used
in a boolean context.
G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics,
clang genrates these intrinsics from __builtin_bitreverse32 and
__builtin_bitreverse64.
Add lower and narrowscalar for G_BITREVERSE.
Lower G_BITREVERSE on MIPS32.
Recommit notes:
Introduce temporary variables in order to make sure
instructions get inserted into MachineFunction in same order
regardless of compiler used to build llvm.
Differential Revision: https://reviews.llvm.org/D71363
G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics,
clang genrates these intrinsics from __builtin_bitreverse32 and
__builtin_bitreverse64.
Add lower and narrowscalar for G_BITREVERSE.
Lower G_BITREVERSE on MIPS32.
Differential Revision: https://reviews.llvm.org/D71363
G_BSWAP is generated from llvm.bswap.<type> intrinsics, clang genrates
these intrinsics from __builtin_bswap32 and __builtin_bswap64.
Add lower and narrowscalar for G_BSWAP.
Lower G_BSWAP on MIPS32, select G_BSWAP on MIPS32 revision 2 and later.
Differential Revision: https://reviews.llvm.org/D71362
This allows us to delete InlineAsm::Constraint_i workarounds in
SelectionDAGISel::SelectInlineAsmMemoryOperand overrides and
TargetLowering::getInlineAsmMemConstraint overrides.
They were introduced to X86 in r237517 to prevent crashes for
constraints like "=*imr". They were later copied to other targets.
Currently -fuse-init-array option is not effective when target triple
does not specify os, on x86,x86_64.
i.e.
// -fuse-init-array is not honored.
$ clang -target i386 -fuse-init-array test.c -S
// -fuse-init-array is honored.
$ clang -target i386-linux -fuse-init-array test.c -S
This patch fixes first case.
And does cleanup.
Reviewers: rnk, craig.topper, fhahn, echristo
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D71360
Legalization algorithm is complicated by two facts:
1) While regular instructions should be possible to legalize in
an isolated, per-instruction, context-free manner, legalization
artifacts can only be eliminated in pairs, which could be deeply, and
ultimately arbitrary nested: { [ () ] }, where which paranthesis kind
depicts an artifact kind, like extend, unmerge, etc. Such structure
can only be fully eliminated by simple local combines if they are
attempted in a particular order (inside out), or alternatively by
repeated scans each eliminating only one innermost pair, resulting in
O(n^2) complexity.
2) Some artifacts might in fact be regular instructions that could (and
sometimes should) be legalized by the target-specific rules. Which
means failure to eliminate all artifacts on the first iteration is
not a failure, they need to be tried as instructions, which may
produce more artifacts, including the ones that are in fact regular
instructions, resulting in a non-constant number of iterations
required to finish the process.
I trust the recently introduced termination condition (no new artifacts
were created during as-a-regular-instruction-retrial of artifacts not
eliminated on the previous iteration) to be efficient in providing
termination, but only performing the legalization in full if and only if
at each step such chains of artifacts are successfully eliminated in
full as well.
Which is currently not guaranteed, as the artifact combines are applied
only once and in an arbitrary order that has to do with the order of
creation or insertion of artifacts into their worklist, which is a no
particular order.
In this patch I make a small change to the artifact combiner, making it
to re-insert into the worklist immediate (modulo a look-through copies)
artifact users of each vreg that changes its definition due to an
artifact combine.
Here the first scan through the artifacts worklist, while not
being done in any guaranteed order, only needs to find the innermost
pair(s) of artifacts that could be immediately combined out. After that
the process follows def-use chains, making them shorter at each step, thus
combining everything that can be combined in O(n) time.
Reviewers: volkan, aditya_nandakumar, qcolombet, paquette, aemerson, dsanders
Reviewed By: aditya_nandakumar, paquette
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71448
Summary:
The use of a boolean isInteger flag (generally initialized using
VT.isInteger()) caused errors in our out-of-tree CHERI backend
(https://github.com/CTSRD-CHERI/llvm-project).
In our backend, pointers use a separate ValueType (iFATPTR) and therefore
.isInteger() returns false. This meant that getSetCCInverse() was using the
floating-point variant and generated incorrect code for us:
`(void *)0x12033091e < (void *)0xffffffffffffffff` would return false.
Committing this change will significantly reduce our merge conflicts
for each upstream merge.
Reviewers: spatel, bogner
Reviewed By: bogner
Subscribers: wuzish, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70917
In order to properly implement these atomic we need one register more than other
binary atomics. It is used for storing result from comparing values in addition
to the one that is used for actual result of operation.
https://reviews.llvm.org/D71028
This has two main effects:
- Optimizes debug info size by saving 221.86 MB of obj file size in a
Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of
object file size.
- Incremental step towards decoupling target intrinsics.
The enums are still compact, so adding and removing a single
target-specific intrinsic will trigger a rebuild of all of LLVM.
Assigning distinct target id spaces is potential future work.
Part of PR34259
Reviewers: efriedma, echristo, MaskRay
Reviewed By: echristo, MaskRay
Differential Revision: https://reviews.llvm.org/D71320
This patch adds forward iterators mc_difflist_iterator,
mc_subreg_iterator and mc_superreg_iterator, based on the existing
DiffListIterator. Those are used to provide iterator ranges over
sub- and super-register from TRI, which are slightly more convenient
than the existing MCSubRegIterator/MCSuperRegIterator. Unfortunately,
it duplicates a bit of functionality, but the new iterators are a bit
more convenient (and can be used with various existing iterator
utilities) and should probably replace the old iterators in the future.
This patch updates some existing users.
Reviewers: evandro, qcolombet, paquette, MatzeB, arsenm
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D70565
Summary:
We rely on this in our CHERI backend to address the GOT by generating a
$pc-relative addresses. For this we emit the following code sequence:
lui $1, %pcrel_hi(_CHERI_CAPABILITY_TABLE_-8)
daddiu $1, $1, %pcrel_lo(_CHERI_CAPABILITY_TABLE_-4)
cgetpccincoffset $c1, $1
However, without this change the addend is implicitly converted to
UINT32_MAX and an invalid pointer value is generated.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: merge_guards_bot, sdardis, hiraditya, jrtc27, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70953
Summary:
In our CHERI fork we use BUNDLE instructions to ensure that a
three-instruction sequence to generate a program-counter-relative value is
emitted without reordering or insertions (since that would break the 32-bit
offset computation).
Currently MipsAsmPrinter asserts when it encounters a pseudo instruction.
To handle BUNDLE we can simply skip the instruction which will then make
EmitInstruction() process the contents of the bundle in order.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: merge_guards_bot, sdardis, hiraditya, jrtc27, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70945
Summary:
In our CHERI fork we use BUNDLE instructions to ensure that a
three-instruction sequence to generate a program-counter-relative value is
emitted without reordering or insertions (since that would break the 32-bit
offset computation). This sequence is created in MipsExpandPseudo and we use
finalizeBundle() to create the BUNDLE instruction.
However, the delay slot filler currently breaks this pattern since the BUNDLE
will be removed and so all instructions are moved into the delay slot.
Since the delay slot only executes the first instruction, this results in
incorrect computations (and run-time crashes) if the branch is taken.
The original test cases uses CHERI instructions, so for the test case here
I simple filled a BUNDLE with a no-op DADDiu $sp_64, -16 and DADDiu $sp_64, 16.
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: merge_guards_bot, sdardis, hiraditya, jrtc27, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70944
Summary:
I was tracking down a code-generation bug in this pass and found that the
added output was useful. It is also helpful to find out why a delay slot
could not be filled even though there is clearly a valid instruction (which
appears to mostly be caused by CFI instructions).
Reviewers: atanasyan
Reviewed By: atanasyan
Subscribers: merge_guards_bot, sdardis, hiraditya, jrtc27, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70940
There are a couple of bugs with the sc, scs, ll, lld instructions expanding:
1. On R6 these instruction pack immediate offset into a 9-bit field. Now
if an immediate exceeds 9-bits assembler does not perform expansion and
just rejects such instruction.
2. On 64-bit non-PIC code if an operand is a symbol assembler generates
incorrect sequence of instructions. It uses R_MIPS_HI16 and R_MIPS_LO16
relocations and skips R_MIPS_HIGHEST and R_MIPS_HIGHER ones.
To solve these problems this patch:
- Introduces `mem_simm9_exp` to mark 9-bit memory immediate operands
which require expansion. Probably later all `mem_simm9` operands will be
able to migrate on `mem_simm9_exp` and we rename it to `mem_simm9`.
- Adds new `OPERAND_MEM_SIMM9` operand type and assigns it to the
`mem_simm9_exp`. That allows to know operand size in the `processInstruction`
method and decide whether we need to expand instruction.
- Adds `expandMem9Inst` method to expand instructions with 9-bit memory
immediate operand. This method just load immediate into a "base"
register used by origibal instruction:
sc $2, 256($sp) => addiu $1, $sp, 256
sc $2, 0($1)
- Fix `expandMem16Inst` to support a correct set of relocations for
symbol loading in case of 64-bit non-PIC code.
ll $12, symbol => lui $12, 0
R_MIPS_HIGHEST symbol
daddiu $12, $12, 0
R_MIPS_HIGHER symbol
dsll $12, $12, 16
daddiu $12, $12, 0
R_MIPS_HI16 symbol
dsll $12, $12, 16
ll $12, 0($12)
R_MIPS_LO16 symbol
- Fix `expandMem16Inst` to unify handling of 3 and 4 operands
instructions.
- Delete unused now `MipsTargetStreamer::emitSCWithSymOffset` method.
Task for next patches - implement expanding for other instructions use
`mem_simm9` operand and other `mem_simm##` operands.
Differential Revision: https://reviews.llvm.org/D70648
Summary:
Most libraries are defined in the lib/ directory but there are also a
few libraries defined in tools/ e.g. libLLVM, libLTO. I'm defining
"Component Libraries" as libraries defined in lib/ that may be included in
libLLVM.so. Explicitly marking the libraries in lib/ as component
libraries allows us to remove some fragile checks that attempt to
differentiate between lib/ libraries and tools/ libraires:
1. In tools/llvm-shlib, because
llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of
all libraries defined in the whole project, there was custom code
needed to filter out libraries defined in tools/, none of which should
be included in libLLVM.so. This code assumed that any library
defined as static was from lib/ and everything else should be
excluded.
With this change, llvm_map_components_to_libnames(LIB_NAMES, "all")
only returns libraries that have been added to the LLVM_COMPONENT_LIBS
global cmake property, so this custom filtering logic can be removed.
Doing this also fixes the build with BUILD_SHARED_LIBS=ON
and LLVM_BUILD_LLVM_DYLIB=ON.
2. There was some code in llvm_add_library that assumed that
libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or
ARG_LINK_COMPONENTS set. This is only true because libraries
defined lib lib/ use LLVMBuild.txt and don't set these values.
This code has been fixed now to check if the library has been
explicitly marked as a component library, which should now make it
easier to remove LLVMBuild at some point in the future.
I have tested this patch on Windows, MacOS and Linux with release builds
and the following combinations of CMake options:
- "" (No options)
- -DLLVM_BUILD_LLVM_DYLIB=ON
- -DLLVM_LINK_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON
- -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON
Reviewers: beanz, smeenai, compnerd, phosek
Reviewed By: beanz
Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70179
Having a generic CPU removes a warning when creating a subtarget without
the CPU being explicitly specified.
Differential Revision: https://reviews.llvm.org/D70490
`expandMemInst` expects instruction with 3 or 4 operands and the last
operand requires expanding. It's redundant to scan all operands in a
loop. We can check the last operands.
* Implements scalable size queries for MVTs, split out from D53137.
* Contains a fix for FindMemType to avoid using scalable vector type
to contain non-scalable types.
* Explicit casts for several places where implicit integer sign
changes or promotion from 32 to 64 bits caused problems.
* CodeGenDAGPatterns will treat scalable and non-scalable vector types
as different.
Reviewers: greened, cameron.mcinally, sdesmalen, rovka
Reviewed By: rovka
Differential Revision: https://reviews.llvm.org/D66871
This patch makes LLVM compatible with GAS. It accepts `la` pseudo
instruction on arch with 64-bit pointers and just shows a warning.
Differential Revision: https://reviews.llvm.org/D70202
O32 ABI uses relocations in REL format. Relocation's addend is written
in place. R_MIPS_JALR relocation points to the `jalr` instruction which
does not have a place to store the relocation addend. So it's impossible
to save non-zero "offset". This patch blocks emission of `R_MIPS_JALR`
relocations in such cases.
Differential Revision: https://reviews.llvm.org/D70201
Introduce IntImmLeaf version of PatLeaf immZExt16 for 32-bit immediates.
Change immZExt16 with imm32ZExt16 for andi, ori and xori.
This keeps same behavior for SDAG and allows for GlobalISel selectImpl
to select 'G_CONSTANT imm' + G_AND, G_OR, G_XOR into ANDi, ORi, XORi,
respectively, when 32-bit imm satisfies imm32ZExt16 predicate: zero
extending 16 low bits of imm is equal to imm.
Large number of test changes comes from zero extending of small types
which is transformed into 'and' with bitmask in legalizer.
Differential Revision:https://reviews.llvm.org/D70185
Introduce IntImmLeaf version of PatLeaf immSExt16 for 32-bit immediates.
Change immSExt16 with imm32SExt16 for addiu.
This keeps same behavior for SDAG and allows for GlobalISel selectImpl
to select 'G_CONSTANT imm' + G_ADD into ADDIu when 32-bit imm satisfies
imm32SExt16 predicate: sign extending 16 low bits of imm is equal to imm.
Differential Revision: https://reviews.llvm.org/D70184
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles touches affected_files header
342380 95 3604 llvm/include/llvm/ADT/STLExtras.h
314730 234 1345 llvm/include/llvm/InitializePasses.h
307036 118 2602 llvm/include/llvm/ADT/APInt.h
213049 59 3611 llvm/include/llvm/Support/MathExtras.h
170422 47 3626 llvm/include/llvm/Support/Compiler.h
162225 45 3605 llvm/include/llvm/ADT/Optional.h
158319 63 2513 llvm/include/llvm/ADT/Triple.h
140322 39 3598 llvm/include/llvm/ADT/StringRef.h
137647 59 2333 llvm/include/llvm/Support/Error.h
131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
Instruction ldi.fmt can be considered cheap enough to avoid spill and restore
of value that it produces since it's loaded from immediate.
Differential Revision: https://reviews.llvm.org/D69898
When a 64-bit triple is used emit an error if the CPU only supports
32-bit code.
Patch by Miloš Stojanović.
Differential Revision: https://reviews.llvm.org/D70018
Refactor usage of isCopyInstrImpl, isCopyInstr and isAddImmediate methods
to return optional machine operand pair of destination and source
registers.
Patch by Nikola Prica
Differential Revision: https://reviews.llvm.org/D69622
`saa` and `saad` are 32-bit and 64-bit store atomic add instructions.
memory[base] = memory[base] + rt
These instructions are available for "Octeon+" CPU. The patch adds support
for both instructions to MIPS assembler and diassembler and introduces new
CPU type - "octeon+".
Next patches will implement `.set arch=octeon+` directive and `AFL_EXT_OCTEONP`
ISA extension flag support.
Differential Revision: https://reviews.llvm.org/D69849
Summary:
G_GEP is rather poorly named. It's a simple pointer+scalar addition and
doesn't support any of the complexities of getelementptr. I therefore
propose that we rename it. There's a G_PTR_MASK so let's follow that
convention and go with G_PTR_ADD
Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm
Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69734
Introduce helper methods and refactor pieces of code related to
register banks in MipsInstructionSelector.
Add a few detailed asserts in order to get a better overview
of LLT, register bank combinations that are supported at the moment
and reduce need to look at other files.
Differential Revision: https://reviews.llvm.org/D69663
Refactor usage of isCopyInstrImpl, isCopyInstr and isAddImmediate methods
to return optional machine operand pair of destination and source
registers.
Patch by Nikola Prica
Differential Revision: https://reviews.llvm.org/D69622
selectImpl is able to select G_FSQRT when we set bank for vector
operands to fprb. Add detailed tests.
Note: G_FSQRT is generated from llvm-ir intrinsics llvm.sqrt.*,
and at the moment MIPS is not able to generate this intrinsic for
vector type (some targets generate vector llvm.sqrt.* from calls
to a builtin function).
__builtin_msa_fsqrt_<format> will be transformed into G_FSQRT
in legalizeIntrinsic and selected in the same way.
Differential Revision: https://reviews.llvm.org/D69376
selectImpl is able to select G_FABS when we set bank for vector
operands to fprb. Add detailed tests.
Note: G_FABS is generated from llvm-ir intrinsics llvm.fabs.*,
and at the moment MIPS is not able to generate this intrinsic for
vector type (some targets generate vector llvm.fabs.* from calls
to a builtin function).
We can handle fabs using __builtin_msa_fmax_a_<format> and passing
same vector as both arguments. __builtin_msa_fmax_a_<format> will
be directly selected into FMAX_A_<format> in legalizeIntrinsic.
Differential Revision: https://reviews.llvm.org/D69346
Select vector G_FADD, G_FSUB, G_FMUL and G_FDIV for MIPS32 with MSA. We
have to set bank for vector operands to fprb and selectImpl will do the
rest. __builtin_msa_fadd_<format>, __builtin_msa_fsub_<format>,
__builtin_msa_fmul_<format> and __builtin_msa_fdiv_<format> will be
transformed into G_FADD, G_FSUB, G_FMUL and G_FDIV in legalizeIntrinsic
respectively and selected in the same way.
Differential Revision: https://reviews.llvm.org/D69340
Select vector G_SDIV, G_SREM, G_UDIV and G_UREM for MIPS32 with MSA. We
have to set bank for vector operands to fprb and selectImpl will do the
rest. __builtin_msa_div_s_<format>, __builtin_msa_mod_s_<format>,
__builtin_msa_div_u_<format> and __builtin_msa_mod_u_<format> will be
transformed into G_SDIV, G_SREM, G_UDIV and G_UREM in legalizeIntrinsic
respectively and selected in the same way.
Differential Revision: https://reviews.llvm.org/D69333
MipsMCAsmInfo was using '$' prefix for Mips32 and '.L' for Mips64
regardless of -target-abi option. By passing MCTargetOptions to MCAsmInfo
we can find out Mips ABI and pick appropriate prefix.
Tags: #llvm, #clang, #lldb
Differential Revision: https://reviews.llvm.org/D66795
Select vector G_MUL for MIPS32 with MSA. We have to set bank
for vector operands to fprb and selectImpl will do the rest.
Manual selection of G_MUL is now done for gprb only.
__builtin_msa_mulv_<format> will be transformed into G_MUL
in legalizeIntrinsic and selected in the same way.
Differential Revision: https://reviews.llvm.org/D69310
Select vector G_SUB for MIPS32 with MSA. We have to set bank
for vector operands to fprb and selectImpl will do the rest.
__builtin_msa_subv_<format> will be transformed into G_SUB
in legalizeIntrinsic and selected in the same way.
__builtin_msa_subvi_<format> will be directly selected into
SUBVI_<format> in legalizeIntrinsic.
Differential Revision: https://reviews.llvm.org/D69306
Select vector G_ADD for MIPS32 with MSA. We have to set bank
for vector operands to fprb and selectImpl will do the rest.
__builtin_msa_addv_<format> will be transformed into G_ADD
in legalizeIntrinsic and selected in the same way.
__builtin_msa_addvi_<format> will be directly selected into
ADDVI_<format> in legalizeIntrinsic. MIR tests for it have
unnecessary additional copies. Capture current state of tests
with run-pass=legalizer with a test in test/CodeGen/MIR/Mips.
Differential Revision: https://reviews.llvm.org/D68984
llvm-svn: 375501
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: arsenm, dschuff, jyknight, sdardis, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, jrtc27, atanasyan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69216
llvm-svn: 375398
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68993
llvm-svn: 375084
Add vector MSA register classes to fprb, they are 128 bit wide.
MSA instructions use the same registers for both integer and floating
point operations. Therefore we only need to check for vector element
size during legalization or instruction selection.
Add helper function in MipsLegalizerInfo and switch to legalIf
LegalizeRuleSet to keep legalization rules compact since they depend
on MipsSubtarget and presence of MSA.
fprb is assigned to all vector operands.
Move selectLoadStoreOpCode to MipsInstructionSelector in order to
reduce number of arguments.
Differential Revision: https://reviews.llvm.org/D68867
llvm-svn: 374872
Check if size of operand LLT matches sizes of available register banks
before inspecting the opcode in order to reduce number of checks.
Factor commonly used pieces of code into functions.
Differential Revision: https://reviews.llvm.org/D68866
llvm-svn: 374870
Now assembler generates two consecutive `.4byte` directives to store
64-bit `li.d' operand. The first directive stores high 4-byte of the
value. The second directive stores low 4-byte of the value. But on
64-bit system we load this value at once and get wrong result if the
system is little-endian.
This patch fixes the bug. It stores the `li.d' operand as a single
8-byte value.
Differential Revision: https://reviews.llvm.org/D68778
llvm-svn: 374598
If `li.s` or `li.d` loads zero into a FPR, it's not necessary to load
zero into `at` GPR register and then move its value into a floating
point register. We can use as a source register the `zero / $0` one.
Differential Revision: https://reviews.llvm.org/D68777
llvm-svn: 374597
The target does just enough to be able to run llvm-exegesis in latency
mode for at least some opcodes.
Patch by Miloš Stojanović.
Differential Revision: https://reviews.llvm.org/D68649
llvm-svn: 374590
If a "double" (64-bit) value has zero low 32-bits, it's possible to load
such value into a GP/FP registers as an instruction immediate. But now
assembler loads only high 32-bits of the value.
For example, if a target register is GPR the `li.d $4, 1.0` instruction
converts into the `lui $4, 16368` one. As a result, we get `0x3FF00000`
in the register. While a correct representation of the `1.0` value is
`0x3FF0000000000000`. The patch fixes that.
Differential Revision: https://reviews.llvm.org/D68776
llvm-svn: 374544
EXPENSIVE_CHECKS build was failing on new test.
This is fixed by marking $ra register as undef.
Test now has -verify-machineinstrs to check for operand flags.
llvm-svn: 374320
The `expandLoadImmReal` handles four different and almost non-overlapping
cases: loading a "single" float immediate into a GPR, loading a "single"
float immediate into a FPR, and the same couple for a "double" float
immediate.
It's better to move each `else if` branch into separate methods.
llvm-svn: 374164
When -pg option is present than a call to _mcount is inserted into every
function. However since the proper ABI was not followed then the generated
gmon.out did not give proper results. By inserting needed instructions
before every _mcount we can fix this.
Differential Revision: https://reviews.llvm.org/D68390
llvm-svn: 374055
This ensures that frame-based unwinding will continue to work when
calling a noreturn function; there is not much use having the caller's
frame pointer saved if you don't also have the caller's program counter.
Patch by James Clarke.
Differential Revision: https://reviews.llvm.org/D68542
llvm-svn: 373907
J/JAL/JALX/JALS are absolute branches, but stay within the current
256 MB-aligned region, so we must include the high bits of the
instruction address when calculating the branch target.
Patch by James Clarke.
Differential Revision: https://reviews.llvm.org/D68548
llvm-svn: 373906
Replace with the MachineFunction. X86 is the only user, and only uses
it for the function. This removes one obstacle from using this in
GlobalISel. The other is the more tolerable EVT argument.
The X86 use of the function seems questionable to me. It checks hasFP,
before frame lowering.
llvm-svn: 373292
Implement aggregate structure split to simpler types in splitToValueTypes.
splitToValueTypes is used for return values.
According to MipsABIInfo from clang/lib/CodeGen/TargetInfo.cpp,
aggregate structure arguments for O32 always get simplified and thus
will remain unsupported by the MIPS GlobalISel for the time being.
For O32, aggregate structures can be encountered only for complex number
returns e.g. 'complex float' or 'complex double' from <complex.h>.
Differential Revision: https://reviews.llvm.org/D67963
llvm-svn: 372957
Neither the base implementation of findCommutedOpIndices nor any in-tree target modifies the instruction passed in and there is no reason why they would in the future.
Committed on behalf of @hvdijk (Harald van Dijk)
Differential Revision: https://reviews.llvm.org/D66138
llvm-svn: 372882
CC_Mips doesn't accept vararg functions for O32, so we have to explicitly
use CC_Mips_FixedArg.
For lowerCall we now properly figure out whether callee function is vararg
or not, this has no effect for O32 since we always use CC_Mips_FixedArg.
For lower formal arguments we need to copy arguments in register to stack
and save pointer to start for argument list into MipsMachineFunction
object so that G_VASTART could use it during instruction select.
For vacopy we need to copy content from one vreg to another,
load and store are used for that purpose.
Differential Revision: https://reviews.llvm.org/D67756
llvm-svn: 372555
The static analyzer is warning about potential null dereferences, but we should be able to use cast<> directly and if not assert will fire for us.
llvm-svn: 372500
This reverts r372314, reapplying r372285 and the commits which depend
on it (r372286-r372293, and r372296-r372297)
This was missing one switch to getTargetConstant in an untested case.
llvm-svn: 372338
This broke the Chromium build, causing it to fail with e.g.
fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15>
See llvm-commits thread of r372285 for details.
This also reverts r372286, r372287, r372288, r372289, r372290, r372291,
r372292, r372293, r372296, and r372297, which seemed to depend on the
main commit.
> Encode them directly as an imm argument to G_INTRINSIC*.
>
> Since now intrinsics can now define what parameters are required to be
> immediates, avoid using registers for them. Intrinsics could
> potentially want a constant that isn't a legal register type. Also,
> since G_CONSTANT is subject to CSE and legalization, transforms could
> potentially obscure the value (and create extra work for the
> selector). The register bank of a G_CONSTANT is also meaningful, so
> this could throw off future folding and legalization logic for AMDGPU.
>
> This will be much more convenient to work with than needing to call
> getConstantVRegVal and checking if it may have failed for every
> constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
> immarg operands, many of which need inspection during lowering. Having
> to find the value in a register is going to add a lot of boilerplate
> and waste compile time.
>
> SelectionDAG has always provided TargetConstant for constants which
> should not be legalized or materialized in a register. The distinction
> between Constant and TargetConstant was somewhat fuzzy, and there was
> no automatic way to force usage of TargetConstant for certain
> intrinsic parameters. They were both ultimately ConstantSDNode, and it
> was inconsistently used. It was quite easy to mis-select an
> instruction requiring an immediate. For SelectionDAG, start emitting
> TargetConstant for these arguments, and using timm to match them.
>
> Most of the work here is to cleanup target handling of constants. Some
> targets process intrinsics through intermediate custom nodes, which
> need to preserve TargetConstant usage to match the intrinsic
> expectation. Pattern inputs now need to distinguish whether a constant
> is merely compatible with an operand or whether it is mandatory.
>
> The GlobalISelEmitter needs to treat timm as a special case of a leaf
> node, simlar to MachineBasicBlock operands. This should also enable
> handling of patterns for some G_* instructions with immediates, like
> G_FENCE or G_EXTRACT.
>
> This does include a workaround for a crash in GlobalISelEmitter when
> ARM tries to uses "imm" in an output with a "timm" pattern source.
llvm-svn: 372314
Encode them directly as an imm argument to G_INTRINSIC*.
Since now intrinsics can now define what parameters are required to be
immediates, avoid using registers for them. Intrinsics could
potentially want a constant that isn't a legal register type. Also,
since G_CONSTANT is subject to CSE and legalization, transforms could
potentially obscure the value (and create extra work for the
selector). The register bank of a G_CONSTANT is also meaningful, so
this could throw off future folding and legalization logic for AMDGPU.
This will be much more convenient to work with than needing to call
getConstantVRegVal and checking if it may have failed for every
constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
immarg operands, many of which need inspection during lowering. Having
to find the value in a register is going to add a lot of boilerplate
and waste compile time.
SelectionDAG has always provided TargetConstant for constants which
should not be legalized or materialized in a register. The distinction
between Constant and TargetConstant was somewhat fuzzy, and there was
no automatic way to force usage of TargetConstant for certain
intrinsic parameters. They were both ultimately ConstantSDNode, and it
was inconsistently used. It was quite easy to mis-select an
instruction requiring an immediate. For SelectionDAG, start emitting
TargetConstant for these arguments, and using timm to match them.
Most of the work here is to cleanup target handling of constants. Some
targets process intrinsics through intermediate custom nodes, which
need to preserve TargetConstant usage to match the intrinsic
expectation. Pattern inputs now need to distinguish whether a constant
is merely compatible with an operand or whether it is mandatory.
The GlobalISelEmitter needs to treat timm as a special case of a leaf
node, simlar to MachineBasicBlock operands. This should also enable
handling of patterns for some G_* instructions with immediates, like
G_FENCE or G_EXTRACT.
This does include a workaround for a crash in GlobalISelEmitter when
ARM tries to uses "imm" in an output with a "timm" pattern source.
llvm-svn: 372285
Summary:
As discussed in https://reviews.llvm.org/D62341#1515637,
for MIPS `add %x, -1` isn't optimal. Unlike X86 there
are no fastpaths to matearialize such `-1`/`1` vector constants,
and `sub %x, 1` results in better codegen,
so undo canonicalization
Reviewers: atanasyan, Petar.Avramovic, RKSimon
Reviewed By: atanasyan
Subscribers: sdardis, arichardson, hiraditya, jrtc27, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66805
llvm-svn: 372254
In case of using 32-bit GOT access to the table requires two instructions
with attached %got_hi and %got_lo relocations. This patch implements
correct expansion of 'lw/sw' instructions in that case.
Differential Revision: https://reviews.llvm.org/D67705
llvm-svn: 372251
We need "xgot" flag in the MipsAsmParser to implement correct expansion
of some pseudo instructions in case of using 32-bit GOT (XGOT).
MipsAsmParser does not have reference to MipsSubtarget but has a
reference to "feature bit set".
llvm-svn: 372220
* Reordered MVT simple types to group scalable vector types
together.
* New range functions in MachineValueType.h to only iterate over
the fixed-length int/fp vector types.
* Stopped backends which don't support scalable vector types from
iterating over scalable types.
Reviewers: sdesmalen, greened
Reviewed By: greened
Differential Revision: https://reviews.llvm.org/D66339
llvm-svn: 372099
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet, JDevlieghere, alexshap, rupprecht, jhenderson
Subscribers: sdardis, nemanjai, hiraditya, kbarton, jakehehrlich, jrtc27, MaskRay, atanasyan, jsji, seiya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D67499
llvm-svn: 371742
IRTranslator creates G_DYN_STACKALLOC instruction during expansion of
alloca when argument that tells number of elements to allocate on stack
is a virtual register. Use default lowering for MIPS32.
Differential Revision: https://reviews.llvm.org/D67440
llvm-svn: 371728
G_IMPLICIT_DEF is used for both integer and floating point implicit-def.
Handle G_IMPLICIT_DEF as ambiguous opcode in MipsRegisterBankInfo.
Select G_IMPLICIT_DEF for MIPS32.
Differential Revision: https://reviews.llvm.org/D67439
llvm-svn: 371727
Summary:
This catches malformed mir files which specify alignment as log2 instead of pow2.
See https://reviews.llvm.org/D65945 for reference,
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: MatzeB, qcolombet, dschuff, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, s.egerton, pzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67433
llvm-svn: 371608
When value of immediate in `mips.nori.b` is 255 (which has all ones in
binary form as 8bit integer) DAGCombiner and Legalizer would fall in an
infinite loop. DAGCombiner would try to simplify `or %value, -1` by
turning `%value` into UNDEF. Legalizer will turn it back into `Constant<0>`
which would then be again turned into UNDEF by DAGCombiner. To avoid this
loop we make UNDEF legal for MSA int types on Mips.
Patch by Mirko Brkusanin.
Differential Revision: https://reviews.llvm.org/D67280
llvm-svn: 371607
microMIPS jump and link exchange instruction stores a target in a
26-bits field. Despite other microMIPS JAL instructions these bits
are target address shifted right 2 bits [1]. The patch fixes the
JALX instruction decoding and uses 2-bit shift.
[1] MIPS Architecture for Programmers Volume II-B: The microMIPS32 Instruction Set
Differential Revision: https://reviews.llvm.org/D67320
llvm-svn: 371428
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: jyknight, sdardis, nemanjai, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, s.egerton, pzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67229
llvm-svn: 371200
G_FENCE comes form fence instruction. For MIPS fence is generated in
AtomicExpandPass when atomic instruction gets surrounded with fence
instruction when needed.
G_FENCE arguments don't have LLT, because of that there is no job for
legalizer and regbankselect. Instruction select G_FENCE for MIPS32.
Differential Revision: https://reviews.llvm.org/D67181
llvm-svn: 371056
Select G_INTRINSIC_W_SIDE_EFFECTS for Intrinsic::trap for MIPS32
via legalizeIntrinsic.
Differential Revision: https://reviews.llvm.org/D67180
llvm-svn: 371055
Instead of returning structure by value clang usually adds pointer
to that structure as an argument. Pointers don't require special
handling no matter the SRet flag. Remove unsuccessful exit from
lowerCall for arguments with SRet flag if they are pointers.
Differential Revision: https://reviews.llvm.org/D67179
llvm-svn: 371054
Summary:
This patch renames functions that takes or returns alignment as log2, this patch will help with the transition to llvm::Align.
The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment.
A few renames uncovered dubious assignments:
- `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were using deal with log2(align). This patch fixes it and updates the documentation.
- `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments, internally these values are interpreted as log2(align). This patch updates the documentation,
- `MachineFunctionexposes` exposes `align-all-functions` also interpreted as power of two alignment, internally this value is interpreted as log2(align). This patch updates the documentation,
Reviewers: lattner, thegameg, courbet
Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65945
llvm-svn: 371045
On AArch64, s128 types have to be split into s64 GPRs when passed as arguments.
This change adds the generic support in call lowering for dealing with multiple
registers, for incoming and outgoing args.
Support for splitting for return types not yet implemented.
Differential Revision: https://reviews.llvm.org/D66180
llvm-svn: 370822
Now the last `.section` directive in the MIPS asm file preamble
is the `.section .mdebug.abi`. If assembler code injected for example
by the LLVM `module asm` or the C ` __asm` directives do not contain
explicit switching to the `.text` section it goes to the `.mdebug.abi`
section. It might be unexpected to the user and in fact for example
breaks building some existing code like FreeBSD libc [1].
The patch forces switching to the `.text` section after emitting MIPS
assembler file preamble.
[1] https://bugs.llvm.org/show_bug.cgi?id=43119
Fix PR43119.
Differential Revision: https://reviews.llvm.org/D67014
llvm-svn: 370735
Add lower for G_FPTOUI. Algorithm is similar to the SDAG version
in TargetLowering::expandFP_TO_UINT.
Lower G_FPTOUI for MIPS32.
Differential Revision: https://reviews.llvm.org/D66929
llvm-svn: 370431
Both methods `MipsTargetStreamer::emitStoreWithSymOffset` and
`MipsTargetStreamer::emitLoadWithSymOffset` are almost the same and
differ argument names only. These methods are used in the single place
so it's better to inline their code and remove original methods.
llvm-svn: 370354
When a "base" in the `lw/sw $reg1, symbol($reg2)` instruction is
a register and generated code is position independent, backend
does not add the "base" value to the symbol address.
```
lw $reg1, %got(symbol)($gp)
lw/sw $reg1, 0($reg1)
```
This patch fixes the bug and adds the missed `addu` instruction by
passing `BaseReg` into the `loadAndAddSymbolAddress` routine and handles
the case when the `BaseReg` is the zero register to escape redundant
`move reg, reg` instruction:
```
lw $reg1, %got(symbol)($gp)
addu $reg1, $reg1, $reg2
lw/sw $reg1, 0($reg1)
```
Differential Revision: https://reviews.llvm.org/D66894
llvm-svn: 370353
If result of 64-bit address loading combines with 32-bit mask, LLVM
tries to optimize the code and remove "redundant" loading of upper
32-bits of the address. It leads to incorrect code on MIPS64 targets.
MIPS backend creates the following chain of commands to load 64-bit
address in the `MipsTargetLowering::getAddrNonPICSym64` method:
```
(add (shl (add (shl (add %highest(sym), %higher(sym)),
16),
%hi(sym)),
16),
%lo(%sym))
```
If the mask presents, LLVM decides to optimize the chain of commands. It
really does not make sense to load upper 32-bits because the 0x0fffffff
mask anyway clears them. After removing redundant commands we get this
chain:
```
(add (shl (%hi(sym), 16), %lo(%sym))
```
There is no patterns matched `(MipsHi (i64 symbol))`. Due a bug in `SYM_32`
predicate definition, backend incorrectly selects a pattern for a 32-bit
symbols and uses the `lui` instruction for loading `%hi(sym)`.
As a result we get incorrect set of instructions with unnecessary 16-bit
left shifting:
```
lui at,0x0
R_MIPS_HI16 foo
dsll at,at,0x10
daddiu at,at,0
R_MIPS_LO16 foo
```
This patch resolves two problems:
- Fix `SYM_32/SYM_64` predicates to prevent selection of patterns dedicated
to 32-bit symbols in case of using N64 ABI.
- Add missed patterns for 64-bit symbols for `%hi/%lo`.
Fix PR42736.
Differential Revision: https://reviews.llvm.org/D66228
llvm-svn: 370268
There is no pattern matched `add hi, (MipsLo texternalsym)`. As a result,
loading an address of 32-bit symbol requires two registers and one more
additional instruction:
```
addiu $1, $zero, %lo(foo)
lui $2, %hi(foo)
addu $25, $2, $1
```
This patch adds the missed pattern and enables generation more effective
set of instructions:
```
lui $1, %hi(foo)
addiu $25, $1, %lo(foo)
```
Differential Revision: https://reviews.llvm.org/D66771
llvm-svn: 370196
Now `lw/sw $reg, sym+offset` pseudo instructions for global symbol `sym`
are lowering into the following three instructions.
```
lw $reg, %got(symbol)($gp)
addiu $reg, $reg, offset
lw/sw $reg, 0($reg)
```
It's possible to reduce the number of instructions by taking the offset
in account in the final `lw/sw` command. This patch implements that
optimization.
```
lw $reg, %got(symbol)($gp)
lw/sw $reg, offset($reg)
```
Differential Revision: https://reviews.llvm.org/D66553
llvm-svn: 369756
Now pseudo instruction `la $6, symbol+8($6)` is expanding into the following
chain of commands:
```
lw $1, %got(symbol+8)($gp)
addiu $1, $1, 8
addu $6, $1, $6
```
This is incorrect. When a linker handles the `R_MIPS_GOT16` relocation,
it does not expect to get any addend and breaks on assertion. Otherwise
it has to create new GOT entry for each unique "sym + offset" pair.
Offset for a global symbol should be added to result of loading GOT
entry by a separate `add` command.
The patch fixes the problem by stripping off an offset from the expression
passed to the `%got`. That's interesting that even current code inserts
a separate `add` command.
Differential Revision: https://reviews.llvm.org/D66552
llvm-svn: 369755
Prefer `MCFixupKind` where possible and add getTargetKind() to
convert to `unsigned` when needed rather than scattering cast
operators around the place.
Differential Revision: https://reviews.llvm.org/D59890
llvm-svn: 369720
In case of expanding `lw/sw $reg, symbol($reg)` instruction for PIC it's
enough to call the `loadAndAddSymbolAddress` method. Additional work
performed by the `expandLoadAddress` is not required here.
llvm-svn: 369563
r351882 allows different type for shift amount then result and value
being shifted. Fix MIPS Legalizer rules to take r351882 into account.
Differential Revision: https://reviews.llvm.org/D66203
llvm-svn: 369510
Add NarrowScalar for G_TRUNC when NarrowTy is half the size of source.
NarrowScalar G_TRUNC to s32 for MIPS32.
Differential Revision: https://reviews.llvm.org/D66202
llvm-svn: 369509
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).
Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor
Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&
Depends on D65919
Reviewers: arsenm, bogner, craig.topper, RKSimon
Reviewed By: arsenm
Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65962
llvm-svn: 369041
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.
llvm-svn: 369013
Currently we can't keep any state in the selector object that we get from
subtarget. As a result we have to plumb through all our variables through
multiple functions. This change makes it non-const and adds a virtual init()
method to allow further state to be captured for each target.
AArch64 makes use of this in this patch to cache a call to hasFnAttribute()
which is expensive to call, and is used on each selection of G_BRCOND.
Differential Revision: https://reviews.llvm.org/D65984
llvm-svn: 368652
Summary:
Targets often have instructions that can sign-extend certain cases faster
than the equivalent shift-left/arithmetic-shift-right. Such cases can be
identified by matching a shift-left/shift-right pair but there are some
issues with this in the context of combines. For example, suppose you can
sign-extend 8-bit up to 32-bit with a target extend instruction.
%1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity)
%2:_(s32) = G_ASHR %1:_(s32), i32 24
%3:_(s32) = G_ASHR %2:_(s32), i32 1
would reasonably combine to:
%1:_(s32) = G_SHL %0:_(s32), i32 24
%2:_(s32) = G_ASHR %1:_(s32), i32 25
which no longer matches the special case. If your shifts and extend are
equal cost, this would break even as a pair of shifts but if your shift is
more expensive than the extend then it's cheaper as:
%2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8
%3:_(s32) = G_ASHR %2:_(s32), i32 1
It's possible to match the shift-pair in ISel and emit an extend and ashr.
However, this is far from the only way to break this shift pair and make
it hard to match the extends. Another example is that with the right
known-zeros, this:
%1:_(s32) = G_SHL %0:_(s32), i32 24
%2:_(s32) = G_ASHR %1:_(s32), i32 24
%3:_(s32) = G_MUL %2:_(s32), i32 2
can become:
%1:_(s32) = G_SHL %0:_(s32), i32 24
%2:_(s32) = G_ASHR %1:_(s32), i32 23
All upstream targets have been configured to lower it to the current
G_SHL,G_ASHR pair but will likely want to make it legal in some cases to
handle their faster cases.
To follow-up: Provide a way to legalize based on the constant. At the
moment, I'm thinking that the best way to achieve this is to provide the
MI in LegalityQuery but that opens the door to breaking core principles
of the legalizer (legality is not context sensitive). That said, it's
worth noting that looking at other instructions and acting on that
information doesn't violate this principle in itself. It's only a
violation if, at the end of legalization, a pass that checks legality
without being able to see the context would say an instruction might not be
legal. That's a fairly subtle distinction so to give a concrete example,
saying %2 in:
%1 = G_CONSTANT 16
%2 = G_SEXT_INREG %0, %1
is legal is in violation of that principle if the legality of %2 depends
on %1 being constant and/or being 16. However, legalizing to either:
%2 = G_SEXT_INREG %0, 16
or:
%1 = G_CONSTANT 16
%2:_(s32) = G_SHL %0, %1
%3:_(s32) = G_ASHR %2, %1
depending on whether %1 is constant and 16 does not violate that principle
since both outputs are genuinely legal.
Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm
Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61289
llvm-svn: 368487
Fast-isel was picking AFGR64 register class for processing call
arguments when +fp64 options was used. We simply check is option +fp64
is used and pick appropriate register.
Patch by Mirko Brkusanin.
Differential Revision: https://reviews.llvm.org/D65886
llvm-svn: 368433
I've now needed to add an extra parameter to this call twice recently. Not only
is the signature getting extremely unwieldy, but just updating all of the
callsites and implementations is a pain. Putting the parameters in a struct
sidesteps both issues.
llvm-svn: 368408
G_JUMP_TABLE and G_BRJT appear from translation of switch statement.
Select these two instructions for MIPS32, both pic and non-pic.
Differential Revision: https://reviews.llvm.org/D65861
llvm-svn: 368274
Function MipsAsmParser::expandMemInst() did not properly handle
instruction `sc` with a symbol as an argument because first argument
would be counted twice. We add additional checks and handle this case
separately.
Patch by Mirko Brkusanin.
Differential Revision: https://reviews.llvm.org/D64252
llvm-svn: 368160
If an operand of the `lw/sw` instructions is a symbol, these instructions
incorrectly lowered using not-position-independent chain of commands.
For PIC code we should use `lw/addiu` instructions with the `R_MIPS_GOT16`
and `R_MIPS_LO16` relocations respectively. Instead of that LLVM generates
position dependent code with the `R_MIPS_HI16` and `R_MIPS_LO16`
relocations.
This patch provides a fix for the bug by handling PIC case separately in
the `MipsAsmParser::expandMemInst`. The main idea is to generate a chain
of PIC instructions to load a symbol address into a register and then
load the address content.
The fix is not optimal and does not fix all PIC-related problems. This
is a task for subsequent patches.
Differential Revision: https://reviews.llvm.org/D65524
llvm-svn: 367580
Fold load/store + G_GEP + G_CONSTANT when
immediate in G_CONSTANT fits into 16 bit signed integer.
Differential Revision: https://reviews.llvm.org/D65507
llvm-svn: 367535
Summary:
This will make it possible to improve IPRA by taking into account
register usage in indirect calls.
NFC yet; this is just laying the groundwork to start building
up patches to take advantage of the information for improved register
allocation.
Reviewers: aditya_nandakumar, volkan, qcolombet, arsenm, rovka, aemerson, paquette
Subscribers: sdardis, wdng, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65488
llvm-svn: 367476