Commit Graph

140801 Commits

Author SHA1 Message Date
Christopher Tetreault 900ec97bbe [UBSan] Cannot negate smallest negative signed integer
Silence warning Undefined Behavior Sanitzer warning:
runtime error: negation of -9223372036854775808 cannot be represented in type 'int64_t' (aka 'long'); cast to an unsigned type to negate this value to itself

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D90710
2020-11-04 10:07:52 -08:00
Craig Topper 3701e33a22 [RISCV] Remove custom isel for (srl (shl val, 32), imm). Use pattern instead. NFCI
We don't need custom matching, we just a need a predicate to check
the immediate is greater than 32. We can use the existing ImmSub32
to adjust the immediate.

I've also used the new predicate in the other location that used
ImmSub32. I tried to create a test case where we would break without
the greater than 32 check on that pattern, but DAG combine defeated me.
Still seemed safer to have it.

Differential Revision: https://reviews.llvm.org/D90546
2020-11-04 09:59:14 -08:00
Joe Nash 58adab34c4 [AMDGPU] Resolve pseudo registers at encoding uses
Pseudo-registers allow different register encodings
between gpu generations. Make sure we resolve the
pseudo regs to real regs whenever we get their
hardware encoding.
Using the correct encodings revealed a register
bank conflict and an unnecessary write dependency.
Tests have been updated to match.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D90721

Change-Id: I73c154cd24aecc820993b50bebaf4df97a5710ca
2020-11-04 12:52:32 -05:00
Fangrui Song bbeb08497c Revert "[GlobalISel] GISelKnownBits::computeKnownBitsImpl - Replace TargetOpcode::G_MUL handling with the common KnownBits::computeForMul implementation"
This reverts commit 0b8711e1af which broke GlobalISelTests AArch64GISelMITest.TestKnownBits
2020-11-04 09:54:04 -08:00
Arthur Eubanks d8f531c42c [NewPM] Don't run before pass instrumentation on required passes
This allows those instrumentation to log when they decide to skip a
pass. This provides extra helpful info for optnone functions and also
will help with opt-bisect.

Have OptNoneInstrumentation print when it skips due to seeing optnone.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90545
2020-11-04 09:45:10 -08:00
Sebastian Neubauer 31a0b2834f [AMDGPU] Fix iterating in SIFixSGPRCopies
The insertion of waterfall loops splits the current basic block into
three blocks. So the basic block that we iterate over must be updated.

This failed assert(!NodePtr->isKnownSentinel()) in ilist_iterator for
divergent calls in branches before.

Differential Revision: https://reviews.llvm.org/D90596
2020-11-04 18:43:19 +01:00
Simon Pilgrim ecbd0413af [KnownBits] KnownBits::computeForMul - avoid unnecessary APInt copies. NFCI.
Use const references instead.
2020-11-04 17:25:25 +00:00
Simon Pilgrim 0b8711e1af [GlobalISel] GISelKnownBits::computeKnownBitsImpl - Replace TargetOpcode::G_MUL handling with the common KnownBits::computeForMul implementation
Avoid code duplication
2020-11-04 17:25:24 +00:00
Arnold Schwaighofer 42f1916640 Revert "Start of an llvm.coro.async implementation"
This reverts commit ea606cced0.

This patch causes memory sanitizer failures sanitizer-x86_64-linux-fast.
2020-11-04 08:26:20 -08:00
Arnold Schwaighofer ea606cced0 Start of an llvm.coro.async implementation
This patch adds the `async` lowering of coroutines.

This will be used by the Swift frontend to lower async functions. In
contrast to the `retcon` lowering the frontend needs to be in control
over control-flow at suspend points as execution might be suspended at
these points.

This is very much work in progress and the implementation will change as
it evolves with the frontend. As such the documentation is lacking
detail as some of it might change.

rdar://70097093

Differential Revision: https://reviews.llvm.org/D90612
2020-11-04 07:32:29 -08:00
Eric Astor bf027da04c [ms] [llvm-ml] Enable support for MASM-style macro procedures
Allows the MACRO directive to define macro procedures with parameters and macro-local symbols.

Supports required and optional parameters (including default values), and matches ml64.exe for its macro-local symbol handling (up to 65536 macro-local symbols in any translation unit).

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D89729
2020-11-04 10:29:57 -05:00
Simon Pilgrim 93c2a9ae07 Use isa<> instead of dyn_cast<> to avoid unused variable warning. NFCI. 2020-11-04 15:26:32 +00:00
Paul C. Anagnostopoulos d56cd4291e [TableGen] Add !interleave operator to concatenate a list of values with delimiters
Add a test. Use it in some TableGen files.

Differential Revision: https://reviews.llvm.org/D90469
2020-11-04 09:23:54 -05:00
Sanjay Patel c74db55ff5 [InstSimplify] allow vector folds for icmp Pred (1 << X), 0x80 2020-11-04 08:12:48 -05:00
Roman Lebedev 93f3d7f7b3
[Reassociate] Guard `add`-like `or` conversion into an `add` with profitability check
This is slightly better compile-time wise,
since we avoid potentially-costly knownbits analysis that will
ultimately not allow us to actually do anything with said `add`.
2020-11-04 16:10:34 +03:00
Simon Moll 351c10cc72 [VE] Add +vpu attribute
`+vpu` controls whether VEISelLowering adds any vregs.  This defaults to
`-vpu` to have scalar code generation out of the box.  We bring up
vector isel under the `+vpu` flag. Once vector isel is stable we switch
to `+vpu` and advertise vregs and vops in TTI.

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D90465
2020-11-04 12:42:00 +01:00
Kerry McLaughlin f2412d372d [SVE][CodeGen] Lower scalable integer vector reductions
This patch uses the existing LowerFixedLengthReductionToSVE function to also lower
scalable vector reductions. A separate function has been added to lower VECREDUCE_AND
& VECREDUCE_OR operations with predicate types using ptest.

Lowering scalable floating-point reductions will be addressed in a follow up patch,
for now these will hit the assertion added to expandVecReduce() in TargetLowering.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D89382
2020-11-04 11:38:49 +00:00
Simon Pilgrim 825e517e34 [DAG] computeKnownBits - Replace ISD::MUL handling with the common KnownBits::computeForMul implementation 2020-11-04 11:32:08 +00:00
Sebastian Neubauer 1124bf4ab7 [AMDGPU] Set rsrc1 flags for graphics shaders
Before they were only set for compute kernels and compute shaders but
not for other shaders.

Differential Revision: https://reviews.llvm.org/D89399
2020-11-04 12:25:41 +01:00
Sebastian Neubauer 76313288cd [AMDGPU] Fix ieee mode default value
Previously, the default value for ieee mode was
- on for compute kernels and compute shaders,
- off for all shaders except compute shaders.

This commit changes the default to be
- on for compute kernels,
- off for shaders.

This aligns the default value with the settings that are actually in
use.  To my knowledge, all users of shader calling conventions (mesa and
llpc) disable the ieee mode by default.

Differential Revision: https://reviews.llvm.org/D89388
2020-11-04 12:25:38 +01:00
David Green eb611930b6 [ARM] Remove unused variable. NFC 2020-11-04 09:00:03 +00:00
Sander de Smalen 73b6cb67dc [NFCI] Replace AArch64StackOffset by StackOffset.
This patch replaces the AArch64StackOffset class by the generic one
defined in TypeSize.h.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D88983
2020-11-04 08:49:00 +00:00
Fangrui Song 0314dff051 [DebugInfo] Delete unused DwarfUnit::addConstantFPValue & addConstantValue overloads. NFC
This functions appear to be unused for many years.
2020-11-04 00:05:57 -08:00
Martin Storsjö 36cf1e7d0e Revert "[AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts"
This reverts commit 59b22e495c.

That commit broke building for ARM and AArch64, reproducible like this:

$ cat apedec-reduced.c
a;
b(e) {
  int c;
  unsigned d = f();
  c = d >> 32 - e;
  return c;
}
g() {
  int h = i();
  if (a)
    h = h << a | b(a);
  return h;
}
$ clang -target aarch64-linux-gnu -w -c -O3 apedec-reduced.c
clang: ../lib/Transforms/InstCombine/InstructionCombining.cpp:3656: bool llvm::InstCombinerImpl::run(): Assertion `DT.dominates(BB, UserParent) && "Dominance relation broken?"' failed.

Same thing for e.g. an armv7-linux-gnueabihf target.
2020-11-04 08:39:32 +02:00
Arthur Eubanks 06926e0f01 Port print-must-be-executed-contexts and print-mustexecute to NPM
Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90207
2020-11-03 21:06:46 -08:00
Xiang1 Zhang 7ba3293691 [StackColoring] Conservatively merge catch point of V for catch(V)
Reviewed By: thanm, clin1

Differential Revision: https://reviews.llvm.org/D86673
2020-11-04 10:49:56 +08:00
Michael Liao 4b11201592 [MachineInstr] Add support for instructions with multiple memory operands.
- Basically iterate each pair of memory operands from both instructions
  and return true if any of them may alias.
- The exception are memory instructions without any memory operand. They
  may touch everything and could alias to any memory instruction.

Differential Revision: https://reviews.llvm.org/D89447
2020-11-03 20:44:40 -05:00
Gaurav Jain 492b1d78d5 [NFC] Use [MC]Register in register allocation
Differential Revision: https://reviews.llvm.org/D90725
2020-11-03 17:34:26 -08:00
Amara Emerson 393b55380a [AArch64][GlobalISel] Add combine for G_EXTRACT_VECTOR_ELT to allow selection of pairwise FADD.
For the <2 x float> case, instead of adding another combine or legalization to
get it into a <4 x float> form, I'm just adding a GISel specific selection
pattern to cover it.

Differential Revision: https://reviews.llvm.org/D90699
2020-11-03 17:25:14 -08:00
Julien Jorge 0fca651711 [WebAssembly] Don't fold frame offset for global addresses
When machine instructions are in the form of
```
%0 = CONST_I32 @str
%1 = ADD_I32 %stack.0, %0
%2 = LOAD 0, 0, %1
```

In the `ADD_I32` instruction, it is possible to fold it if `%0` is a
`CONST_I32` from an immediate number. But in this case it is a global
address, so we shouldn't do that. But we haven't checked if the operand
of `ADD` is an immediate so far. This fixes the problem. (The case
applies the same for `ADD_I64` and `CONST_I64` instructions.)

Fixes https://bugs.llvm.org/show_bug.cgi?id=47944.

Patch by Julien Jorge (jjorge@quarkslab.com)

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D90577
2020-11-03 14:56:25 -08:00
Sanjay Patel c40126e740 [ARM] remove cost-kind predicate for most math op costs
This is based on the same idea that I am using for the basic model implementation
and what I have partly already done for x86: throughput cost is number of
instructions/uops, so size/blended costs are identical except in special cases
(for example, fdiv or other known-expensive machine instructions or things like
MVE that may require cracking into >1 uop)).

Differential Revision: https://reviews.llvm.org/D90692
2020-11-03 17:23:46 -05:00
Jordan Rupprecht 980bf1d5d1 [NFC] Inline wasm assertion-only variable 2020-11-03 13:06:59 -08:00
Xun Li 7f34aca083 [musttail] Unify musttail call preceding return checking
There is already an API in BasicBlock that checks and returns the musttail call if it precedes the return instruction.
Use it instead of manually checking in each place.

Differential Revision: https://reviews.llvm.org/D90693
2020-11-03 11:39:27 -08:00
Roman Lebedev 70472f34b2
[Reassociate] Convert `add`-like `or`'s into an `add`'s to allow reassociation
InstCombine is quite aggressive in doing the opposite transform,
folding `add` of operands with no common bits set into an `or`,
and that not many things support that new pattern..

In this case, teaching Reassociate about it is easy,
there's preexisting art for `sub`/`shl`:
just convert such an `or` into an `add`:
https://rise4fun.com/Alive/Xlyv
2020-11-03 22:30:51 +03:00
Sanne Wouda 2ec26d3a23 Revert "Add loop distribution to the LTO pipeline"
This reverts commit 6e80318eec.
2020-11-03 19:29:27 +00:00
Sanne Wouda 6e80318eec Add loop distribution to the LTO pipeline
The LoopDistribute pass is missing from the LTO pipeline, so
-enable-loop-distribute has no effect during post-link. The pre-link
loop distribution doesn't seem to survive the LTO pipeline either.

With this patch (and -flto -mllvm -enable-loop-distribute) we see a 43%
uplift on SPEC 2006 hmmer for AArch64. The rest of SPECINT 2006 is
unaffected.

Differential Revision: https://reviews.llvm.org/D89896
2020-11-03 18:54:24 +00:00
Andy Wingo 107c3a12d6 [WebAssembly] Implement ref.null
This patch adds a new "heap type" operand kind to the WebAssembly MC
layer, used by ref.null. Currently the possible values are "extern" and
"func"; when typed function references come, though, this operand may be
a type index.

Note that the "heap type" production is still known as "refedtype" in
the draft proposal; changing its name in the spec is
ongoing (https://github.com/WebAssembly/reference-types/issues/123).

The register form of ref.null is still untested.

Differential Revision: https://reviews.llvm.org/D90608
2020-11-03 10:46:23 -08:00
Jameson Nash 59a6ab28c4 [GVN] small improvements to comments 2020-11-03 13:21:48 -05:00
Simon Pilgrim 6dabc38cce Cleanup namespace comment to fix clang-tidy warning. NFCI. 2020-11-03 18:13:21 +00:00
Simon Pilgrim e9b88c754a [DAG] computeKnownBits - Move ISD::SRA handling into KnownBits::ashr
As discussed on D90527, we should be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.
2020-11-03 18:09:33 +00:00
Craig Topper 00eff96e1d [RISCV] Add missing patterns for rotr with immediate for Zbb/Zbp extensions.
DAGCombine doesn't canonicalize rotl/rotr with immediate so we
need patterns for both.

Remove the custom matcher for rotl to RORI and just use a SDNodeXForm
to convert the immediate instead. Doing this gives priority to the
rev32/rev16 versions of grevi over rori since an explicit immediate
is more precise than any immediate. I also added rotr patterns for
rev32/rev16. And removed the (or (shl), (shr)) patterns that should be
combined to rotl by DAG combine.

There is at least one other grev pattern that probably needs a
another rotr pattern, but we need more test coverage first.

Differential Revision: https://reviews.llvm.org/D90575
2020-11-03 10:04:52 -08:00
Simon Pilgrim cb798f040a [DAG] computeKnownBits - Move (most) ISD::SRL handling into KnownBits::lshr
As discussed on D90527, we should be be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.

The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before.

We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.
2020-11-03 17:30:36 +00:00
Esme-Yi 5053eab890 Revert "[PowerPC] Extend folding RLWINM + RLWINM to post-RA."
This reverts commit 119ab2181e.
2020-11-03 16:34:02 +00:00
Tim Renouf 89d41f3a2b [AMDGPU] Add gfx1033 target
Differential Revision: https://reviews.llvm.org/D90447

Change-Id: If2650fc7f31bbdd49c76e74a9ca8e3734d769761
2020-11-03 16:27:48 +00:00
Tim Renouf ee3e642627 [AMDGPU] Add gfx90c target
This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were
previously included in gfx909.

Differential Revision: https://reviews.llvm.org/D90419

Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d
2020-11-03 16:27:43 +00:00
Jay Foad 040c50278c [AMDGPU] Fix ds_read2/write2 with unaligned offsets
These instructions use a scaled offset. We were wrongly selecting them
even when the required offset was not a multiple of the scale factor.

Differential Revision: https://reviews.llvm.org/D90607
2020-11-03 15:16:10 +00:00
Jameson Nash a0ad066ce4 make the AsmPrinterHandler array public
This lets external consumers customize the output, similar to how
AssemblyAnnotationWriter lets the caller define callbacks when printing
IR. The array of handlers already existed, this just cleans up the code
so that it can be exposed publically.

Replaces https://reviews.llvm.org/D74158

Differential Revision: https://reviews.llvm.org/D89613
2020-11-03 10:02:09 -05:00
Simon Pilgrim cab21d4fa8 [DAG] computeKnownBits - Move (most) ISD::SHL handling into KnownBits::shl
As discussed on D90527, we should be be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.

The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before.

We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.
2020-11-03 14:22:28 +00:00
Sanjay Patel 9af561ec99 [x86] update cost table comments for maxnum; NFC
Follow-up suggested in D90613.
2020-11-03 08:09:59 -05:00
Roman Lebedev c009d11bda
[InstCombine] Perform C-(X+C2) --> (C-C2)-X transform before using Negator
In particular, it makes it fire for C=0, because negator doesn't want
to perform that fold since in general it's not beneficial.
2020-11-03 16:06:52 +03:00