3581 Commits

Author SHA1 Message Date
Matt Arsenault 8bc03d2168 GlobalISel: Merge G_PTR_MASK with llvm.ptrmask intrinsic
Confusingly, these were unrelated and had different semantics. The
G_PTR_MASK instruction predates the llvm.ptrmask intrinsic, but has a
different format. G_PTR_MASK only allows clearing the low bits of a
pointer, and only a constant number of bits. The ptrmask intrinsic
allows an arbitrary mask. Replace G_PTR_MASK to match the intrinsic.

Only selects the cases that look like the old instruction. More work
is needed to select the general case. Also new legalization code is
still needed to deal with the case where the incoming mask size does
not match the pointer size, which has a specified behavior in the
langref.
2020-05-26 11:48:13 -04:00
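The relationship between the two forms described in the G_PTR_MASK commit above can be sketched in a few lines. This is an illustrative model on plain integers, not LLVM code: the function names and the 64-bit default pointer width are assumptions. It shows that clearing a constant number of low bits is a special case of an arbitrary mask, and how a mask can be recognized as "looking like the old instruction".

```python
def clear_low_bits(ptr: int, num_low_bits: int, ptr_bits: int = 64) -> int:
    """Old-style form: clear a constant number of low bits of a pointer."""
    full = (1 << ptr_bits) - 1
    return ptr & full & ~((1 << num_low_bits) - 1)


def ptrmask(ptr: int, mask: int, ptr_bits: int = 64) -> int:
    """Intrinsic-style form: AND the pointer with an arbitrary mask."""
    return ptr & mask & ((1 << ptr_bits) - 1)


def low_bits_cleared_by(mask: int, ptr_bits: int = 64):
    """Return N if the mask only clears the low N bits, otherwise None."""
    full = (1 << ptr_bits) - 1
    cleared = full & ~mask  # bits the mask would clear
    # The cleared bits must form a contiguous run starting at bit 0 (2**N - 1).
    if cleared & (cleared + 1) != 0:
        return None
    return cleared.bit_length()


example_ptr = 0x1237
old_style = clear_low_bits(example_ptr, 4)
new_style = ptrmask(example_ptr, 0xFFFFFFFFFFFFFFF0)
```

Under these assumptions, `old_style` and `new_style` are the same value, and `low_bits_cleared_by` returns `None` for masks such as `0xFF00FF00` that do not fit the old pattern.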
Fangrui Song 872c5fb143 [AsmPrinter] Don't generate .Lfoo$local for -fno-PIC and -fPIE
-fno-PIC and -fPIE code generally cannot be linked in -shared mode and there is no benefit accessing via local aliases.

Actually, a .Lfoo$local reference will be converted to a STT_SECTION (if no section relaxation) reference which will cause the section symbol (sizeof(Elf64_Sym)=24) to be generated.
2020-05-25 23:35:49 -07:00
Simon Pilgrim 9fa58d1bf2 [DAG] Add SimplifyDemandedVectorElts binop SimplifyMultipleUseDemandedBits handling
For the supported binops (basic arithmetic, logicals + shifts), if we fail to simplify the demanded vector elts, then call SimplifyMultipleUseDemandedBits and try to peek through ops to remove unnecessary dependencies.

This helps with PR40502.

Differential Revision: https://reviews.llvm.org/D79003
2020-05-25 12:41:22 +01:00
Amara Emerson 99660217e9 [AArch64][GlobalISel] When generating SUBS for compares, don't write to wzr/xzr.
Although writing to wzr/xzr is correct since we don't care about the result
of the sub, only the flags, doing so causes tail merge blocks to fail.

Writing to an unused virtual register instead allows the optimization to fire,
improving performance significantly on 256.bzip2.

Differential Revision: https://reviews.llvm.org/D80460
2020-05-23 22:59:49 -07:00
Nikita Popov 2833c46f75 [DwarfEHPrepare] Don't prune unreachable resumes at optnone
Disable pruning of unreachable resumes in the DwarfEHPrepare pass
at optnone. While I expect the pruning itself to be essentially free,
this does require a dominator tree calculation that is not used for
anything else. Saving this DT construction makes for a 0.4% O0
compile-time improvement.

Differential Revision: https://reviews.llvm.org/D80400
2020-05-23 20:58:01 +02:00
Nikita Popov 0c6bba71e3 [TargetPassConfig] Don't add alias analysis at optnone
When performing codegen at optnone, don't add alias analysis to
the pipeline. We don't need it, but it causes an unnecessary
dominator tree calculation.

I've also moved the module verifier call to the top so that a bunch
of disabled-at-optnone passes group more nicely.

Differential Revision: https://reviews.llvm.org/D80378
2020-05-23 10:35:03 +02:00
Jean-Michel Gorius 65cd2c7a80 Revert "[CodeGen] Add support for multiple memory operands in MachineInstr::mayAlias"
This temporarily reverts commit 7019cea26d.

It seems that, for some targets, there are instructions with a lot of memory operands (probably more than would be expected). This causes a lot of buildbots to time out and report failed builds. While investigations are ongoing to find out why this happens, revert the changes.
2020-05-22 21:26:46 +02:00
Jessica Paquette 49a4f3f7d8 [AArch64][GlobalISel] Add a post-legalizer combiner with a very simple combine.
(This patch is by Jessica, I'm just committing it on her behalf because I need
a post-legalizer combiner for something else).

This supersedes D77250, which did equivalent work in the selector. This can be
done pre-legalization or post-legalization. Post-legalization is more likely to
hit, since G_IMPLICIT_DEFs tend to appear during legalization. There's no reason
to not do it pre-legalization though-- if it can be caught earlier, great.

(I also think that it might be worth reimplementing D78769 using a
target-specific post-legalization combine too after thinking about it for a
while.)

Differential Revision: https://reviews.llvm.org/D78852
2020-05-21 18:47:32 -07:00
Alexey Lapshin bf242c067e [AARCH64][NEON] Allow to sink operands of aarch64_neon_pmull64.
Summary:
This patch fixes a problem where the pmull2 instruction is not
generated for the vmull_high_p64 intrinsic.

ISel has a pattern for int_aarch64_neon_pmull64 intrinsic to generate
PMULL2 instruction. That pattern assumes that extraction operations
are located in the same basic block. We need to sink them
if they are not. Handle operands of int_aarch64_neon_pmull64
in AArch64TargetLowering::shouldSinkOperands.

Reviewed by: efriedma

Differential Revision: https://reviews.llvm.org/D80320
2020-05-22 01:35:24 +03:00
Jean-Michel Gorius 7019cea26d [CodeGen] Add support for multiple memory operands in MachineInstr::mayAlias
Summary:
To support all targets, the mayAlias member function needs to support instructions with multiple memory operands.

This revision also changes the order of the emitted instructions in some test cases.

Reviewers: efriedma, hfinkel, craig.topper, dmgreen

Reviewed By: efriedma

Subscribers: MatzeB, dmgreen, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80161
2020-05-21 23:02:54 +02:00
Eli Friedman f09d220c71 [AArch64][SVE] Fill out missing unpredicated load/store patterns.
The set of patterns for unpredicated load/store was incomplete: it only
included non-extending stores.  Fill out the remaining patterns for
extending stores, and add the corresponding support to frame offset
lowering.

Differential Revision: https://reviews.llvm.org/D80349
2020-05-21 13:29:30 -07:00
Jon Roelofs 5fb979dd06 [llvm][test] Add missing FileCheck colons. NFC
2020-05-21 09:29:27 -06:00
Eli Friedman b4f9b34701 [AArch64] Fix unwind info generated by outliner.
The offsets were wrong. The result is now the same as what the compiler
would generate for a function that spills lr normally.

Differential Revision: https://reviews.llvm.org/D80238
2020-05-20 16:39:00 -07:00
Francis Visoiu Mistrih 770ba4f051 [AArch64] Fix GlobalISel tests on non-darwin platforms
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/6998
2020-05-20 16:26:58 -07:00
Francis Visoiu Mistrih 161122ea1c [AArch64] Provide Darwin variants of most calling conventions
With the new SVE stack layout, we now need to provide a Darwin variant
for all the calling conventions based on the main AAPCS CSR save order.

This also changes APCS_SwiftError to have a Darwin and a non-Darwin
version, assuming it could be used on other platforms these days, and
restricts the AArch64_CXX_TLS calling convention to Darwin.

Differential Revision: https://reviews.llvm.org/D73805
2020-05-20 16:03:48 -07:00
Cameron McInally e89a08aefd [SVE] MOVPRFX zero merging test renaming
Differential Revision: https://reviews.llvm.org/D80244
2020-05-19 17:33:19 -05:00
Matt Arsenault 08ae945318 GlobalISel: Copy correct flags to select
This was looking for a compare condition, and copying the compare
flags. I don't think this was ever correct outside of certain min/max
patterns which aren't checked, but this probably predates select
instructions having fast math flags.
2020-05-19 18:31:24 -04:00
Eli Friedman 5d2c3a0b8c [AArch64] Disable MachineOutliner on Windows.
The handling of unwind info is broken, so disable it for now.
2020-05-19 13:49:03 -07:00
Amara Emerson 665da59685 [AArch64][GlobalISel] Add legalizer & selector support for G_FREEZE.
These should legalize like undefs and select into copies.

The ll test is copied from the x86 test, minus the half fp case because
we don't currently support that.
2020-05-18 16:25:33 -07:00
Matt Arsenault ae98939172 GlobalISel: Fold G_MUL x, 0, and G_*DIV 0, x
2020-05-18 18:08:26 -04:00
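The two folds named in the commit title above can be illustrated with a toy model. This is a sketch, not the combiner's implementation: operands are modeled as Python ints for known constants and strings for unknown registers, and reading `G_*DIV` as `G_SDIV` and `G_UDIV` is an assumption.

```python
def fold_to_zero(opcode: str, lhs, rhs):
    """Return 0 if the operation folds to zero, otherwise None.

    Known constants are ints; unknown values are register names (strings).
    """
    # x * 0 is 0 regardless of x.
    if opcode == "G_MUL" and rhs == 0:
        return 0
    # 0 / x is 0 for every x where the division is defined.
    if opcode in ("G_SDIV", "G_UDIV") and lhs == 0:
        return 0
    return None


mul_result = fold_to_zero("G_MUL", "%x", 0)
div_result = fold_to_zero("G_UDIV", 0, "%x")
no_fold = fold_to_zero("G_UDIV", "%x", 0)
```

Here `mul_result` and `div_result` are both `0`, while `no_fold` is `None` because a zero divisor gives no value to fold to.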
Francesco Petrogalli b572d9b1a7 [llvm][sve] Intrinsics for SVE sudot and usdot instructions.
Summary:
This patch adds IR intrinsics for the mnemonics USDOT and SUDOT of the
8.6 extension of Armv8-a.

Reviewers: sdesmalen, efriedma, david-arm

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79876
2020-05-18 22:02:19 +00:00
Francesco Petrogalli 01f9d8ce5c [llvm][SVE] IR intrinsics for matrix multiplication instructions.
Summary:
Instructions:

* SMMLA
* UMMLA
* USMMLA
* FMMLA

Reviewers: sdesmalen, efriedma, kmclaughlin

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79638
2020-05-18 22:02:19 +00:00
Amara Emerson 17842025ed [GlobalISel] Add support for using vector values in memset inlining.
2020-05-18 14:56:16 -07:00
Jay Foad 9a05547954 [AArch64] Precommit tests for D77316
2020-05-16 16:00:02 +01:00
Ten Tzen e32f8e5d4a [Windows EH] Fix the order of Nested try-catches in $tryMap$ table
This bug is exposed by Test7 of ehthrow.cxx in the MSVC EH suite, where
a rethrow occurs in a try-catch inside a catch (i.e., a nested Catch
handler). See the test code in
https://github.com/microsoft/compiler-tests/blob/master/eh/ehthrow.cxx#L346

When an object is rethrown in a Catch handler, the copy-ctor of this
object must be executed after the destructions of live objects, but
BEFORE the dtors of live objects in parent handlers.

Today the Windows 64-bit runtime (__CxxFrameHandler3 & 4) expects nested Catch
handlers to be stored in pre-order (outer first, inner next) in the $tryMap$ table, so
that given a State, its Catch's beginning State can be properly
retrieved. The Catch beginning state (which is also the ending State) is
the State where rethrown object's copy-ctor must take place.

LLVM currently stores nested catch handlers in post-ordering because
it's the natural way to compute the highest State in Catch.
The fix is to simply store TryCatch handler in pre-order, but update
Catch's highest State after child Catches are all processed.

Differential Revision: https://reviews.llvm.org/D79474?id=263919
2020-05-15 22:03:43 -07:00
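The ordering fix described in the Windows EH commit above can be sketched on a toy tree. This is not the LLVM implementation: the `TryCatch` class, the `catch_state` numbers, and the table layout are all hypothetical. It only shows the idea of appending an entry in pre-order (outer first) while filling in the catch's highest state after its nested catches have been visited.

```python
from dataclasses import dataclass, field


@dataclass
class TryCatch:
    name: str
    catch_state: int  # hypothetical state number of this catch handler
    nested: list = field(default_factory=list)  # try-catches inside the handler


def build_try_map(root: TryCatch) -> list:
    """Emit entries in pre-order, updating each catch's highest state afterwards."""
    table = []

    def visit(node: TryCatch) -> int:
        entry = {"name": node.name, "catch_high": node.catch_state}
        table.append(entry)  # reserve the slot first, so the outer entry precedes inner ones
        for child in node.nested:
            # Only after a child is processed do we know its highest state.
            entry["catch_high"] = max(entry["catch_high"], visit(child))
        return entry["catch_high"]

    visit(root)
    return table


inner = TryCatch("inner", catch_state=4)
outer = TryCatch("outer", catch_state=2, nested=[inner])
try_map = build_try_map(outer)
```

With this input the table lists `outer` before `inner`, and the `outer` entry still ends up with the highest state of its nested handler.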
Eli Friedman a1ce88b4e3 [AArch64][SVE] Implement AArch64ISD::SETCC_PRED
This unifies SETCC operations along the lines of other operations.

Differential Revision: https://reviews.llvm.org/D79975
2020-05-15 11:53:21 -07:00
Konstantin Schwarz 5425cdc3ad [GlobalISel][InlineAsm] Add early return for memory inputs that need to be indirectified
Summary:
D78319 introduced basic support for inline asm input operands in GlobalISel.
However, that patch did not handle the case where a memory input operand still needs to
be indirectified. Later code asserts that the memory operand is already indirect.

This patch adds an early return false to trigger the SelectionDAG fallback for now.

Reviewers: arsenm, paquette

Reviewed By: arsenm

Subscribers: thakis, wdng, rovka, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79955
2020-05-15 13:37:06 +02:00
David Sherwood 525b8e6dcb [SVE] Fix wrong usage of getNumElements() in matchIntrinsicType
I have changed the ScalableVecArgument case in matchIntrinsicType
to create a new FixedVectorType. This means that the next case we
hit (Vector) will not assert when calling getNumElements(), since
we know that it's always a FixedVectorType. This is a temporary
measure for now, and it will be fixed properly in another patch
that refactors this code.

The changes are covered by this existing test:

CodeGen/AArch64/sve-intrinsics-fp-converts.ll

In addition, I have added a new test to ensure that we correctly
reject SVE intrinsics when called with fixed length vector types.

Differential Revision: https://reviews.llvm.org/D79416
2020-05-15 08:44:59 +01:00
Nico Weber e0c1554274 Revert "[GlobalISel][InlineAsm] Add early return for memory inputs that need to be indirectified"
This reverts commit 887dfeec53.
It broke irtranslator-inline-asm.ll on many bots, e.g.
http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/38606/steps/test-check-all/logs/FAIL%3A%20LLVM%3A%3Airtranslator-inline-asm.ll
2020-05-14 19:37:05 -04:00
Konstantin Schwarz 887dfeec53 [GlobalISel][InlineAsm] Add early return for memory inputs that need to be indirectified
Summary:
D78319 introduced basic support for inline asm input operands in GlobalISel.
However, that patch did not handle the case where a memory input operand still needs to
be indirectified. Later code asserts that the memory operand is already indirect.

This patch adds an early return false to trigger the SelectionDAG fallback for now.

Reviewers: arsenm, paquette

Reviewed By: arsenm

Subscribers: wdng, rovka, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79955
2020-05-14 23:42:31 +02:00
Cameron McInally b085e51d81 [AArch64][SVE] Add some integer DestructiveBinaryComm* patterns
Add DestructiveBinaryComm* patterns for ADD, SUB, and SUBR.

Differential Revision: https://reviews.llvm.org/D76711
2020-05-14 16:35:49 -05:00
Eli Friedman 4532a50899 Infer alignment of unmarked loads in IR/bitcode parsing.
For IR generated by a compiler, this is really simple: you just take the
datalayout from the beginning of the file, and apply it to all the IR
later in the file. For optimization testcases that don't care about the
datalayout, this is also really simple: we just use the default
datalayout.

The complexity here comes from the fact that some LLVM tools allow
overriding the datalayout: some tools have an explicit flag for this,
some tools will infer a datalayout based on the code generation target.
Supporting this properly required plumbing through a bunch of new
machinery: we want to allow overriding the datalayout after the
datalayout is parsed from the file, but before we use any information
from it. Therefore, IR/bitcode parsing now has a callback to allow tools
to compute the datalayout at the appropriate time.

Not sure if I covered all the LLVM tools that want to use the callback.
(clang? lli? Misc IR manipulation tools like llvm-link?). But this is at
least enough for all the LLVM regression tests, and IR without a
datalayout is not something frontends should generate.

This change had some sort of weird effects for certain CodeGen
regression tests: if the datalayout is overridden with a datalayout with
a different program or stack address space, we now parse IR based on the
overridden datalayout, instead of the one written in the file (or the
default one, if none is specified). This broke a few AVR tests, and one
AMDGPU test.

Outside the CodeGen tests I mentioned, the test changes are all just
fixing CHECK lines and moving around datalayout lines in weird places.

Differential Revision: https://reviews.llvm.org/D78403
2020-05-14 13:03:50 -07:00
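The callback ordering described in the alignment-inference commit above (parse the datalayout, give the tool a chance to override it, and only then use it) can be modeled roughly as follows. This is a toy sketch: the datalayout is reduced to a hypothetical dict from type name to ABI alignment in bytes, and `infer_load_alignments` and its parameters are invented for illustration, not taken from the LLVM parser.

```python
def infer_load_alignments(file_layout, loads, layout_callback=None):
    """Fill in the alignment of unmarked loads.

    file_layout: dict mapping type name -> ABI alignment, as read from the file.
    loads: list of (type name, alignment or None) pairs in file order.
    layout_callback: optional function that receives the file's layout and
        returns a replacement layout, or None to keep the file's layout.
    """
    layout = file_layout
    # The override happens after the file's layout is known,
    # but before any information from it is used.
    if layout_callback is not None:
        replacement = layout_callback(file_layout)
        if replacement is not None:
            layout = replacement
    return [(ty, align if align is not None else layout[ty]) for ty, align in loads]


loads = [("i32", None), ("i64", 16)]
from_file = infer_load_alignments({"i32": 4, "i64": 8}, loads)
overridden = infer_load_alignments(
    {"i32": 4, "i64": 8}, loads, layout_callback=lambda _: {"i32": 2, "i64": 8}
)
```

In this model the explicitly aligned load keeps its alignment in both cases, while the unmarked `i32` load picks up whichever layout is in effect once the callback has run.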
Konstantin Schwarz 91063cf85a [GlobalISel][InlineAsm] Add support for basic input operand constraints
Reviewers: arsenm, dsanders, aemerson, volkan, t.p.northover, paquette

Reviewed By: arsenm

Subscribers: gargaroff, wdng, rovka, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78319
2020-05-14 10:43:37 +02:00
Michael Berg a255870f03 Propagate MIFlags in table gen
Summary: Add flag propagation to tablegen via OutMIs from originating MI in InstructionSelector::executeMatchTable.

Reviewers: dsanders, volkan

Reviewed By: dsanders

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74988
2020-05-13 18:32:59 -07:00
Wei Zhao 382d3a85e2 [AARch64] Add Marvell ThunderX3T110 support
This is the first checkin to support Marvell ThunderX3T110.

Initial definition of the micro-ops of the instructions in ThunderX3T110
is included.

Differential Revision: https://reviews.llvm.org/D78129
2020-05-13 16:58:51 -07:00
Florian Hahn 824a859332 [AArch64] Don't promote constants with float ConstantExpr.
Currently the AsmPrinter cannot emit some floating point constant
expressions in global initializers. Avoid generating them.

Reviewers: dmgreen, t.p.northover, arsenm, efriedma, Gerolf

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D79865
2020-05-13 23:31:47 +01:00
Eli Friedman a52f10b5a3 [AArch64][SVE] Add patterns for VSELECT of immediate merged with a variable.
This covers forms involving "CPY (immediate, merging)".

Differential Revision: https://reviews.llvm.org/D79803
2020-05-13 15:02:08 -07:00
Davide Italiano a9e8562651 [GIsel] Update a comment and make it more precise.
This only covers ANYEXT/ZEXT. SEXT is covered in another test
I just checked in.
2020-05-12 15:38:20 -07:00
Davide Italiano 99d60a1d0b [GlobalISel] Assign the correct location when combining G_SEXT.
<rdar://problem/62991635>
2020-05-12 15:32:18 -07:00
Jay Foad 989be65b11 [GlobalISel][IRTranslator] Fix <1 x Ty> handling in ConstantExprs
Summary:
ConstantExprs involving operations on <1 x Ty> could translate into MIR
that failed to verify with:
*** Bad machine code: Reading virtual register without a def ***

The problem was that translate(const Constant &C, Register Reg) had
recursive calls that passed the same Reg in for the translation of a
subexpression, but without updating VMap for the subexpression first as
translate(const Constant &C, Register Reg) expects.

Fix this by using the same translateCopy helper function that we use for
translating Instructions. In some cases this causes extra G_COPY
MIR instructions to be generated.

Fixes https://bugs.llvm.org/show_bug.cgi?id=45576

Reviewers: arsenm, volkan, t.p.northover, aditya_nandakumar

Subscribers: jvesely, wdng, nhaehnle, rovka, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78378
2020-05-12 16:51:03 +01:00
Sander de Smalen 077d2d6802 [CodeGen][SVE] Add patterns for whole vector predicate select
Added patterns to implement `select i1 %p, <vty> %a, <vty> %b`

Reviewed By: efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79356
2020-05-12 11:47:39 +01:00
Petre-Ionut Tudor 9682d0d5dc [ARM] Refactor lower to S[LR]I optimization
Summary:
The optimization has been refactored to fix certain bugs and
limitations. The condition for lowering to S[LR]I has been changed
to reflect the manual pseudocode description of SLI and SRI operation.
The optimization can now handle more cases of operand type and order.

Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D79233
2020-05-12 11:00:13 +01:00
David Sherwood 42c7a6d52b [CodeGen] Fix incorrect uses of getVectorNumElements()
I have fixed up some places in SelectionDAG::getNode() where we
used to assert that the number of vector elements for two types
are the same. I have changed such cases to assert that the
element counts are the same instead. I've added new tests that
exercise the code paths for all the truncations. All the extend
operations are covered by this existing test:

  CodeGen/AArch64/sve-sext-zext.ll

For the ISD::SETCC case I fixed, the code path is exercised by
these existing tests:

  CodeGen/AArch64/sve-fcmp.ll
  CodeGen/AArch64/sve-intrinsics-int-compares-with-imm.ll

Differential Revision: https://reviews.llvm.org/D79399
2020-05-12 07:50:37 +01:00
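Why comparing element counts differs from comparing the number of elements, as the getVectorNumElements() commit above describes, can be shown with a small model. The `ElementCount` class here is a simplified stand-in written for this sketch (a minimum element count plus a scalable flag), not the LLVM class itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementCount:
    min_elements: int  # known minimum number of elements
    scalable: bool     # True for types like <vscale x 4 x i32>


fixed_v4 = ElementCount(min_elements=4, scalable=False)    # models <4 x i32>
scalable_v4 = ElementCount(min_elements=4, scalable=True)  # models <vscale x 4 x i32>

# Comparing only the minimum number of elements treats the two as equal ...
same_num_elements = fixed_v4.min_elements == scalable_v4.min_elements
# ... while comparing the full element count keeps them distinct.
same_element_count = fixed_v4 == scalable_v4
```

In this sketch `same_num_elements` is `True` but `same_element_count` is `False`, which is the distinction the changed assertions rely on.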
Eli Friedman c9c930ae67 [SelectionDAG] Don't promote the alignment of allocas beyond the stack alignment.
allocas in LLVM IR have a specified alignment. When that alignment is
specified, the alloca has at least that alignment at runtime.

If the specified type of the alloca has a higher preferred alignment,
SelectionDAG currently ignores that specified alignment, and increases
the alignment. It does this even if it would trigger stack realignment.
I don't think this makes sense, so this patch changes that.

I was looking into this for SVE in particular: for SVE, overaligning
vscale'ed types is extra expensive because it requires realigning the
stack multiple times, or using dynamic allocation. (This currently isn't
implemented.)

I updated the expected assembly for a couple tests; in particular, for
arg-copy-elide.ll, the optimization in question does not increase the
alignment the way SelectionDAG normally would. For the rest, I just
increased the specified alignment on the allocas to match what
SelectionDAG was inferring.

Differential Revision: https://reviews.llvm.org/D79532
2020-05-11 17:39:00 -07:00
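The alignment rule described in the alloca commit above can be written as a short function. This is a sketch of the rule as the message states it, with hypothetical parameter names and alignments given in bytes; it is not copied from SelectionDAG.

```python
def alloca_alignment(specified: int, preferred: int, stack_align: int) -> int:
    """Alignment to use for an alloca, in bytes.

    The specified alignment is always honored. The type's preferred alignment
    may raise it, but never beyond the stack alignment.
    """
    if preferred > specified:
        return max(min(preferred, stack_align), specified)
    return specified


capped = alloca_alignment(specified=4, preferred=16, stack_align=8)
promoted = alloca_alignment(specified=4, preferred=8, stack_align=16)
explicit = alloca_alignment(specified=32, preferred=16, stack_align=16)
```

Under these assumptions `capped` stops at the stack alignment, `promoted` reaches the preferred alignment because it fits within the stack alignment, and `explicit` keeps the larger specified alignment even though it exceeds the stack alignment.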
Eli Friedman a8874c76e8 [AArch64][SVE] Add patterns for VSELECT of immediates.
This covers forms involving "CPY (immediate, zeroing)".

This doesn't handle the case where the operands are reversed, and the
condition is freely invertible.  Not sure how to handle that.  Maybe a
DAGCombine.

Differential Revision: https://reviews.llvm.org/D79598
2020-05-11 17:04:22 -07:00
Davide Italiano 288c9e8178 [GlobalISel] Remove debug locations when emitting G_FCONSTANT.
<rdar://problem/62991543>
2020-05-11 16:25:03 -07:00
Jessica Paquette cd59458f27 [AArch64][GlobalISel] Make LR livein to entry in llvm.returnaddress selection
This fixes a couple verifier failures on this bot:

http://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-aarch64-globalisel-O0-g/

The failures show up in eeprof-1.c and pr17377.c in the GCC C Torture Suite.

Specifically:

*** Bad machine code: MBB has allocatable live-in, but isn't entry or landing-pad. ***
- function:    foo
- basic block: %bb.3 if.end (0x7fac7106dfc8)
- p. register: $lr

and

*** Bad machine code: Using an undefined physical register ***
- function:    f
- basic block: %bb.1 entry (0x7f8941092588)
- instruction: %18:gpr64 = COPY $lr
- operand 1:   $lr

Unlike SDAG, we were setting LR as a live in to the block containing the
returnaddress.

Also, this ensures that we don't add LR as a livein to the entry block twice.
In MachineBasicBlock.h there's a comment saying

"Note that it is an error to add the same register to the same set more than
once unless the intention is to call sortUniqueLiveIns after all registers are
added."

so it's probably good to avoid adding LR twice.

Surprisingly the verifier doesn't complain about that. Maybe it should.

Differential Revision: https://reviews.llvm.org/D79657
2020-05-11 11:32:12 -07:00
Matt Arsenault ee1a69824d GlobalISel: Combine G_UNMERGE_VALUES with G_TRUNC
G_BITCAST can be lowered with a pair of G_UNMERGE_VALUES and
G_MERGE_VALUES with different types, but G_UNMERGE_VALUES of a vector
can also be implemented with a bitcast to a scalar, which introduces
the possibility of infinite loops. Try to eliminate an illegal source
register type in the artifact combiner to prevent this from happening.

Avoids infinite looping in the legalizer in a future patch which
allows lowering G_UNMERGE_VALUES of a vector source with a G_BITCAST.
2020-05-09 16:14:32 -04:00
Jessica Paquette f66309deab [GlobalISel] Don't add duplicate successors to MBBs when translating indirectbr
This fixes a verifier failure on a bot:

http://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-aarch64-O0-g/

```
*** Bad machine code: MBB has duplicate entries in its successor list. ***
- function:    foo
- basic block: %bb.5 indirectgoto (0x7fe3d687ca08)
```

One of the GCC torture suite tests (pr70460.c) has an indirectbr instruction
which has duplicate blocks in its destination list.

According to the langref this is allowed:

> Blocks are allowed to occur multiple times in the destination list, though
> this isn’t particularly useful.
(https://www.llvm.org/docs/LangRef.html#indirectbr-instruction)

We don't allow this in MIR. So, when we translate such an instruction, the
verifier screams.

This patch makes `translateIndirectBr` check if a successor has already been
added to a block. If the successor is present, it is skipped rather than added
twice.

Differential Revision: https://reviews.llvm.org/D79609
2020-05-08 13:40:02 -07:00
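The check described in the indirectbr commit above amounts to skipping a destination that is already in the successor list. A minimal sketch, with block names as plain strings and a hypothetical helper name rather than the actual `translateIndirectBr` code:

```python
def unique_successors(destinations):
    """Return the destinations in order, skipping blocks already added."""
    seen = set()
    successors = []
    for block in destinations:
        if block in seen:
            continue  # already a successor: skip instead of adding it twice
        seen.add(block)
        successors.append(block)
    return successors


# An indirectbr destination list may legally repeat blocks.
destination_list = ["bb1", "bb2", "bb1", "bb3", "bb2"]
successor_list = unique_successors(destination_list)
```

The first occurrence of each block is kept, so the relative order of successors is unchanged.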
Hiroshi Yamauchi 1b4e3def03 [BFI][CGP] Add limited support for detecting missed BFI updates and fix one in CodeGenPrepare.
Summary:
This helps detect some missed BFI updates during CodeGenPrepare.

This is debug build only and disabled behind a flag.

Fix a missed update in CodeGenPrepare::dupRetToEnableTailCallOpts().

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77417
2020-05-07 11:58:00 -07:00