Commit Graph

49684 Commits

Author SHA1 Message Date
Craig Topper 6428a2cd9a [X86] Add custom promotion of v2i8/v2i16 fp_to_sint to avoid over promotion to v2i64 which would force scalarization.
llvm-svn: 346259
2018-11-06 19:24:21 +00:00
Matthias Braun c6613879ce LivePhysRegs/IfConversion: Change some types from unsigned to MCPhysReg; NFC
Change the type in a couple of lists and sets that only store physical
registers from unsigned to MCPhysRegs. The later is only 16bits and
saves us a bit of memory.

llvm-svn: 346254
2018-11-06 19:00:11 +00:00
Simon Atanasyan bb36aea1d5 [mips] Support sigrie instruction
The `sigrie` instruction signals a Reserved Instruction Exception.
This patch adds support for assembling / disassembling the instruction.

Differential Revision: http://reviews.llvm.org/D53861

llvm-svn: 346230
2018-11-06 14:37:24 +00:00
Clement Courbet 54a1184fff [X86][NFC] Fix comment.
llvm-svn: 346226
2018-11-06 13:48:56 +00:00
Matthias Braun 96d12513a1 AArch64: Cleanup CCMP code; NFC
Cleanup CCMP pattern matching code in preparation for review/bugfix:
- Rename `isConjunctionDisjunctionTree()` to `canEmitConjunction()`
  (it won't accept arbitrary disjunctions and is really about whether we
   can transform the subtree into a conjunction that we can emit).
- Rename `emitConjunctionDisjunctionTree()` to `emitConjunction()`

llvm-svn: 346203
2018-11-06 03:15:22 +00:00
Sam Clegg 5292d17ec8 Revert "[WebAssembly] Fixup `main` signature by default"
This reverts rL345880.  It caused some test failures on the
webassembly waterfall.  e.g. binaryen2.test_mainenv fails due
the fact that `envp` ends up being undef rather than 0.

Differential Revision: https://reviews.llvm.org/D54117

llvm-svn: 346187
2018-11-06 00:31:02 +00:00
Matthias Braun 7a75a91b5b MachineFunction: Store more specific reference to LLVMTargetMachine; NFC
MachineFunction can only be used in code using lib/CodeGen, hence we
can keep a more specific reference to LLVMTargetMachine rather than just
TargetMachine around.

Do the same for references in ScheduleDAG and RegUsageInfoCollector.

llvm-svn: 346183
2018-11-05 23:49:14 +00:00
Craig Topper 0b5f8169b0 [TargetLowering] Change TargetLoweringBase::getPreferredVectorAction to take an MVT instead of an EVT. NFC
The main caller of this already has an MVT and several targets called getSimpleVT inside without checking isSimple. This makes the simpleness explicit.

llvm-svn: 346180
2018-11-05 23:26:13 +00:00
Konstantin Zhuravlyov 108927b944 AMDGPU: Add sram-ecc feature
Differential Revision: https://reviews.llvm.org/D53222

llvm-svn: 346177
2018-11-05 22:44:19 +00:00
Craig Topper def82a81af [X86] Don't turn any_extend from a mask register into a sign_extend during lowering. Add patterns to match any_extend during isel instead.
SimplifyDemandedBits can turn a sign_extend back into an any_extend and trigger an infinite loop. So instead legalize it the same way as a sign_extend, but preserve the opcode. Then just pattern match it the same as sign_extend during isel.

I don't have a reduced test case for such an infinite loop yet.

llvm-svn: 346170
2018-11-05 22:08:17 +00:00
Zaara Syeda 7509880b54 [Power9] Add support for stxvw4x.be and stxvd2x.be intrinsics
On Power9, we don't have patterns to select the following intrinsics:
llvm.ppc.vsx.stxvw4x.be
llvm.ppc.vsx.stxvd2x.be

This patch adds support for these.

Differential Revision: https://reviews.llvm.org/D53581

llvm-svn: 346148
2018-11-05 17:31:26 +00:00
Stefan Maksimovic 8d7c351799 [Mips] Supplement long branch pseudo instructions
Expand on LONG_BRANCH_LUi and LONG_BRANCH_(D)ADDiu pseudo
instructions by creating variants which support
less operands/accept GPR64Opnds as their operand in order
to appease the machine verifier pass.

Differential Revision: https://reviews.llvm.org/D53977

llvm-svn: 346133
2018-11-05 14:37:41 +00:00
Neil Henning 233a02d0ed [AMDGPU] Fix the new atomic optimizer in pixel shaders.
The new atomic optimizer I previously added in D51969 did not work
correctly when a pixel shader was using derivatives, and had helper
lanes active.

To fix this we add an llvm.amdgcn.ps.live call that guards a branch
around the entire atomic operation - ensuring that all helper lanes are
inactive within the wavefront when we compute our atomic results.

I've added a test case that can cause derivatives, and exposes the
problem.

Differential Revision: https://reviews.llvm.org/D53930

llvm-svn: 346128
2018-11-05 12:04:48 +00:00
Sam Parker fec793c98f [ARM] Turn assert into condition in ARMCGP
Turn the assert in PrepareConstants into a conditon so that we can
handle mul instructions with negative immediates.

Differential Revision: https://reviews.llvm.org/D54094

llvm-svn: 346126
2018-11-05 11:26:04 +00:00
Sam Parker fcd8adab30 [ARM][ARMCGP] Remove unecessary zexts and truncs
r345840 slightly changed the way promotion happens which could
result in zext and truncs having the same source and destination
types. This fixes that issue.

We can now also remove the zext and trunc in the following case:
(zext (trunc (promoted op)), i32)

This means that we can no longer treat a value, that is only used by
a sink, to be safe to promote.

I've also added in some extra asserts and replaced a cast for a
dyn_cast.

Differential Revision: https://reviews.llvm.org/D54032

llvm-svn: 346125
2018-11-05 10:58:37 +00:00
Dylan McKay 4c5a5c8db6 [AVR] Fix a backend bug that left extraneous operands after expansion
This patch fixes a bug in the AVR FRMIDX expansion logic.

The expansion would leave a leftover operand from the original FRMIDX,
but now attached to a MOVWRdRr instruction. The MOVWRdRr instruction
did not expect this operand and so LLVM rejected the machine
instruction.

This would trigger an assertion:

    Assertion failed: ((isImpReg || Op.isRegMask() || MCID->isVariadic() ||
                        OpNo < MCID->getNumOperands() || isMetaDataOp) &&
                        "Trying to add an operand to a machine instr that is already done!"),
    function addOperand, file llvm/lib/CodeGen/MachineInstr.cpp

Tim fixed this so that now the FRMIDX is expanded correctly into
a well-formed MOVWRdRr.

Patch by Tim Neumann

llvm-svn: 346117
2018-11-05 05:49:04 +00:00
Craig Topper 30b627e5c9 [X86] Custom type legalize v2i8/v2i16/v2i32 mul to use to pmuludq.
v2i8/v2i16/v2i32 are promoted to v2i64. pmuludq takes a v2i64 input and produces a v2i64 output. Since we don't about the upper bits of the type legalized multiply we can use the pmuludq to produce the multiply result for the bits we do care about.

llvm-svn: 346115
2018-11-05 05:02:12 +00:00
Dylan McKay 9a9ae99b30 [AVR] Disallow the LDDWRdPtrQ instruction with Z as the destination
This is an AVR-specific workaround for a limitation of the register
allocator that only exposes itself on targets with high register
contention like AVR, which only has three pointer registers.

The three pointer registers are X, Y, and Z.
In most nontrivial functions, Y is reserved for the frame pointer,
as per the calling convention. This leaves X and Z. Some instructions,
such as LPM ("load program memory"), are only defined for the Z
register. Sometimes this just leaves X.

When the backend generates a LDDWRdPtrQ instruction with Z as the
destination pointer, it usually trips up the register allocator
with this error message:

  LLVM ERROR: ran out of registers during register allocation

This patch is a hacky workaround. We ban the LDDWRdPtrQ instruction
from ever using the Z register as an operand. This gives the
register allocator a bit more space to allocate, fixing the
regalloc exhaustion error.

Here is a description from the patch author Peter Nimmervoll

  As far as I understand the problem occurs when LDDWRdPtrQ uses
  the ptrdispregs register class as target register. This should work, but
  the allocator can't deal with this for some reason. So from my testing,
  it seams like (and I might be totally wrong on this) the allocator reserves
  the Z register for the ICALL instruction and then the register class
  ptrdispregs only has 1 register left and we can't use Y for source and
  destination. Removing the Z register from DREGS fixes the problem but
  removing Y register does not.

More information about the bug can be found on the avr-rust issue
tracker at https://github.com/avr-rust/rust/issues/37.

A bug has raised to track the removal of this workaround and a proper
fix; PR39553 at https://bugs.llvm.org/show_bug.cgi?id=39553.

Patch by Peter Nimmervoll

llvm-svn: 346114
2018-11-05 05:00:44 +00:00
Craig Topper ed6a0a817f [X86] Add vector shift by immediate to SimplifyDemandedBitsForTargetNode.
Summary: This also enables some constant folding from KnownBits propagation. This helps on some cases vXi64 case in 32-bit mode where constant vectors appear as vXi32 and a bitcast. This can prevent getNode from constant folding sra/shl/srl.

Reviewers: RKSimon, spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54069

llvm-svn: 346102
2018-11-04 17:31:27 +00:00
Craig Topper 1ba86188cf [SelectionDAG] Remove special methods for creating *_EXTEND_VECTOR_INREG nodes. Move asserts into getNode.
These methods were just wrappers around getNode with additional asserts (identical and repeated 3 times). But getNode already has a switch that can be used to hold these asserts that allows them to be shared for all 3 opcodes. This also enables checking on the places that create these nodes without using the wrappers.

The rest of the patch is just changing all callers to use getNode directly.

llvm-svn: 346087
2018-11-04 02:10:18 +00:00
Craig Topper 7aed9e600b [X86] Update comment I forgot to change in r346043. NFC
llvm-svn: 346073
2018-11-03 19:49:13 +00:00
Reid Kleckner 2bcb288ade [codeview] Let the X86 backend tell us the VFRAME offset adjustment
Use MachineFrameInfo's OffsetAdjustment field to pass this information
from the target to CodeViewDebug.cpp. The X86 backend doesn't use it for
any other purpose.

This fixes PR38857 in the case where there is a non-aligned quantity of
CSRs and a non-aligned quantity of locals.

llvm-svn: 346062
2018-11-03 00:41:52 +00:00
Craig Topper f7108aef14 [X86] In LowerEXTEND_VECTOR_INREG, emit a vector shuffle instead of directly using X86ISD::UNPCKL
The majority of the changes are because the rest of shuffle lowering/combining prefers to replace the undef input with the other operand. Using UNPCKL directly seemed to avoid this and just grabbed a randomish register for the undef which can create false dependencies.

llvm-svn: 346050
2018-11-02 22:48:02 +00:00
Wouter van Oortmerssen de28b5d17f [WebAssembly] Parsing missing directives to produce valid .o
Summary:
The assembler was able to assemble and then dump back to .s, but
was failing to parse certain directives necessary for valid .o
output:
- .type directives are now recognized to distinguish function symbols
  and others.
- .size is now parsed to provide function size.
- .globaltype (introduced in https://reviews.llvm.org/D54012) is now
  recognized to ensure symbols like __stack_pointer have a proper type
  set for both .s and .o output.

Also added tests for the above.

Reviewers: sbc100, dschuff

Subscribers: jgravelle-google, aheejin, dexonsmith, kristina, llvm-commits, sunfish

Differential Revision: https://reviews.llvm.org/D53842

llvm-svn: 346047
2018-11-02 22:04:33 +00:00
Craig Topper 60c202a494 [X86] Don't emit *_extend_vector_inreg nodes when both the input and output types are legal with AVX1
We already have custom lowering for the AVX case in LegalizeVectorOps. So its better to keep the regular extend op around as long as possible.

I had to qualify one place in DAG combine that created illegal vector extending load operations. This change by itself had no effect on any tests which is why its included here.

I've made a few cleanups to the custom lowering. The sign extend code no longer creates an identity shuffle with undef elements. The zero extend code now emits a zero_extend_vector_inreg instead of an unpckl with a zero vector.

For the high half of the custom lowering of zero_extend/any_extend, we're now using an unpckh with a zero vector or undef. Previously we used used a pshufd to move the upper 64-bits to the lower 64-bits and then used a zero_extend_vector_inreg. I think the zero vector should require less execution resources and be smaller code size.

Differential Revision: https://reviews.llvm.org/D54024

llvm-svn: 346043
2018-11-02 21:09:49 +00:00
Alex Bradbury 52c27785ce [RISCV] Add some missing expansions for floating-point intrinsics
A number of intrinsics, such as llvm.sin.f32, would result in a failure to 
select. This patch adds expansions for the relevant selection DAG nodes, as 
well as exhaustive testing for all f32 and f64 intrinsics.

The codegen for FMA remains a TODO item, pending support for the various 
RISC-V FMA instruction variants.

The llvm.minimum.f32.* and llvm.maximum.* tests are commented-out, pending 
upstream support for target-independent expansion, as discussed in 
http://lists.llvm.org/pipermail/llvm-dev/2018-November/127408.html.

Differential Revision: https://reviews.llvm.org/D54034
Patch by Luís Marques.

llvm-svn: 346034
2018-11-02 19:50:38 +00:00
Heejin Ahn 5b023e07ea [WebAssembly] Fix bugs in rethrow depth counting and InstPrinter
Summary:
EH stack depth is incremented at `try` and decremented at `catch`. When
there are more than two catch instructions for a try instruction, we
shouldn't count non-first catches when calculating EH stack depths.

This patch fixes two bugs:
- CFGStackify: Exclude `catch_all` in the terminate catch pad when
  calculating EH pad stack, because when we have multiple catches for a
  try we should count only the first catch instruction when calculating
  EH pad stack.
- InstPrinter: The initial intention was also to exclude non-first
  catches, but it didn't account nested try-catches, so it failed on
  this case:
```
try
  try
  catch
  end
catch    <-- (1)
end
```
In the example, when we are at the catch (1), the last seen EH
instruction is not `try` but `end_try`, violating the wrong assumption.

We don't need these after we switch to the second proposal because there
is gonna be only one `catch` instruction. But anyway before then these
bugfixes are necessary for keep trunk in working state.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53819

llvm-svn: 346029
2018-11-02 18:38:52 +00:00
Matthias Braun 5f7cb79e94 ARMExpandPseudoInsts: Fix CMP_SWAP expansion adding a kill flag to a def
llvm-svn: 346026
2018-11-02 18:22:15 +00:00
Jonas Paulsson cced2a2775 [SystemZ::TTI] Improve cost handling of uint/sint to fp conversions.
Let i8/i16 uint/sint to fp conversions cost 1 if operand is a load.

Since the load already does the extension, there is no extra cost (previously
returned 2).

Review: Ulrich Weigand
https://reviews.llvm.org/D54028

llvm-svn: 346009
2018-11-02 17:53:31 +00:00
Sylvestre Ledru df92dabaef Fixed inclusion of M_PI fow MinGW-w64
Patch by KOLANICH

llvm-svn: 346000
2018-11-02 17:25:40 +00:00
Jonas Paulsson 79f2441eee [SystemZ] Rework getInterleavedMemoryOpCost()
Model this function more closely after the BasicTTIImpl version, with
separate handling of loads and stores. For loads, the set of actually loaded
vectors is checked.

This makes it more readable and just slightly more accurate generally.

Review: Ulrich Weigand
https://reviews.llvm.org/D53071

llvm-svn: 345998
2018-11-02 17:15:36 +00:00
Krzysztof Parzyszek f070544f8e [Hexagon] Do not reduce load size for globals in small-data
Small-data (i.e. GP-relative) loads and stores allow 16-bit scaled
offset. For a load of a value of type T, the small-data area is
equivalent to an array "T sdata[65536]". This implies that objects
of smaller sizes need to be closer to the beginning of sdata,
while larger objects may be farther away, or otherwise the offset
may be insufficient to reach it. Similarly, an object of a larger
size should not be accessed via a load of a smaller size.

llvm-svn: 345975
2018-11-02 14:17:47 +00:00
Alexey Bataev 8831ef7a16 [DEBUGINFO, NVPTX]DO not emit ',debug' option if no debug info or only debug directives are requested.
Summary:
If the output of debug directives only is requested, we should drop
emission of ',debug' option from the target directive. Required for
supporting of nvprof profiler.

Reviewers: probinson, echristo, dblaikie

Subscribers: Hahnfeld, jholewinski, llvm-commits, JDevlieghere, aprantl

Differential Revision: https://reviews.llvm.org/D46061

llvm-svn: 345972
2018-11-02 13:47:47 +00:00
Neil Henning 7d1b77df57 [AMDGPU] UBSan bug fix for r345710
UBSan detected an error in our ISelLowering that is exposed only when
you have a dmask == 0x1. Fix this by adding in an explicit check to
ensure we don't do the UBSan detected shl << 32.

llvm-svn: 345962
2018-11-02 10:24:57 +00:00
Matt Arsenault 8e0269ba0b AMDGPU: Fix assertion with bitcast from i64 constant to v4i16
llvm-svn: 345922
2018-11-02 02:43:55 +00:00
Wouter van Oortmerssen 3231e518a3 [WebAssembly] Added a .globaltype directive to .s output.
Summary:
Assembly output can use globals like __stack_pointer implicitly,
but has no way of indicating the type of such a global, which makes
it hard for tools processing it (such as the MC Assembler) to
reconstruct this information.

The improved assembler directives parsing (in progress in
https://reviews.llvm.org/D53842) will make use of this information.

Also deleted code for the .import_global directive which was unused.

New test case in userstack.ll

Reviewers: dschuff, sbc100

Subscribers: jgravelle-google, aheejin, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54012

llvm-svn: 345917
2018-11-02 00:45:00 +00:00
Thomas Lively b2382c8bf7 [WebAssembly] General vector shift lowering
Summary: Adds support for lowering non-splat shifts.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53625

llvm-svn: 345916
2018-11-02 00:39:57 +00:00
Thomas Lively fb84fd7c8e [WebAssembly] Expand inserts and extracts with variable indices
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53964

llvm-svn: 345913
2018-11-02 00:06:56 +00:00
Mandeep Singh Grang 547a0d765a [COFF, ARM64] Implement Intrinsic.sponentry for AArch64
Summary: This patch adds Intrinsic.sponentry. This intrinsic is required to correctly support setjmp for AArch64 Windows platform.

Patch by: Yin Ma (yinma@codeaurora.org)

Reviewers: mgrang, ssijaric, eli.friedman, TomTan, mstorsjo, rnk, compnerd, efriedma

Reviewed By: efriedma

Subscribers: efriedma, javed.absar, kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D53996

llvm-svn: 345909
2018-11-01 23:22:25 +00:00
Farhana Aleen 5853762e5a [AMDGPU] Handle the idot8 pattern generated by FE.
Summary: Different variants of idot8 codegen dag patterns are not generated by llvm-tablegen due to a huge
         increase in the compile time. Support the pattern that clang FE generates after reordering the
         additions in integer-dot8 source language pattern.

Author: FarhanaAleen

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D53937

llvm-svn: 345902
2018-11-01 22:48:19 +00:00
Mandeep Singh Grang df19e57a1c [COFF, ARM64] Implement llvm.addressofreturnaddress intrinsic
Reviewers: rnk, mstorsjo, efriedma, TomTan

Reviewed By: efriedma

Subscribers: javed.absar, kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D53962

llvm-svn: 345892
2018-11-01 21:23:47 +00:00
Heejin Ahn 2e398976ba [WebAssembly] Fix signature parsing for 'try' in AsmParser
Summary:
Like `block` or `loop`, `try` can take an optional signature which can
be omitted. This patch allows `try`'s signature to be omitted. Also
added some tests for EH instructions.

Reviewers: aardappel

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53873

llvm-svn: 345888
2018-11-01 20:32:15 +00:00
Reid Kleckner 4af6025f09 [Hexagon] Remove unintended fallthrough from MC duplex code
I added these annotations in r345878 because I wasn't sure if the
fallthrough was intended. Krzysztof Parzyszek confirmed that they should
be breaks, so that's what this patch does.

Reviewers: kparzysz

Differential Revision: https://reviews.llvm.org/D53991

llvm-svn: 345883
2018-11-01 19:59:27 +00:00
Reid Kleckner 4dc0b1ac60 Fix clang -Wimplicit-fallthrough warnings across llvm, NFC
This patch should not introduce any behavior changes. It consists of
mostly one of two changes:
1. Replacing fall through comments with the LLVM_FALLTHROUGH macro
2. Inserting 'break' before falling through into a case block consisting
   of only 'break'.

We were already using this warning with GCC, but its warning behaves
slightly differently. In this patch, the following differences are
relevant:
1. GCC recognizes comments that say "fall through" as annotations, clang
   doesn't
2. GCC doesn't warn on "case N: foo(); default: break;", clang does
3. GCC doesn't warn when the case contains a switch, but falls through
   the outer case.

I will enable the warning separately in a follow-up patch so that it can
be cleanly reverted if necessary.

Reviewers: alexfh, rsmith, lattner, rtrieu, EricWF, bollu

Differential Revision: https://reviews.llvm.org/D53950

llvm-svn: 345882
2018-11-01 19:54:45 +00:00
Sam Clegg ddf049869a [WebAssembly] Fixup `main` signature by default
Differential Revision: https://reviews.llvm.org/D53396

llvm-svn: 345880
2018-11-01 19:38:44 +00:00
Reid Kleckner bebc53f838 Annotate possibly unintended fallthroughs in Hexagon MC code, NFC
Clang's -Wimplicit-fallthrough check fires on these switch cases. GCC
does not warn when a case body that ends in a switch falls through to a
case label of an outer switch.

It's not clear if these fall throughs are truly intended.  The Hexagon
tests pass regardless of whether these case blocks fall through or
break.

For now, I have applied the intended fallthrough annotation macro with a
FIXME comment to unblock enabling the warning. I will send a follow-up
patch that converts them to breaks to the Hexagon maintainers.

llvm-svn: 345878
2018-11-01 19:32:04 +00:00
Volkan Keles 0a8dc9eb0f [GlobalISel] Fix a bug in LegalizeRuleSet::clampMaxNumElements
Summary:
This function was causing a crash when `MaxElements == 1` because
it was trying to create a single element vector type.

Reviewers: dsanders, aemerson, aditya_nandakumar

Reviewed By: dsanders

Subscribers: rovka, kristof.beyls, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D53734

llvm-svn: 345875
2018-11-01 19:01:53 +00:00
Simon Pilgrim b34a052852 [LegalizeDAG] Add generic vector CTPOP expansion (PR32655)
This patch adds support for expanding vector CTPOP instructions and removes the x86 'bitmath' lowering which replicates the same expansion.

Differential Revision: https://reviews.llvm.org/D53258

llvm-svn: 345869
2018-11-01 18:22:11 +00:00
Reid Kleckner ba982b5f8f [Hexagon] Fix MO_JumpTable const extender conversion
Previously this case fell through to unreachable, so it is clearly not
covered by any test case in LLVM. It may be dynamically unreachable, in
fact. However, if it were to run, this is what it would logically do.
The assert suggests that the intended behavior was not to allow folding
offsets from jump table indices, which makes sense.

llvm-svn: 345868
2018-11-01 18:14:45 +00:00
Reid Kleckner eb56894a4b [AArch64] Fix unintended fallthrough and strengthen cast
This was added in r330630. GCC's -Wimplicit-fallthrough seems to not
fire when the previous case contains a switch itself.

This fallthrough was bening because the helper function implementing the
case used dyn_cast to re-check the type of the node in question. After
fixing the fallthrough, we can strengthen the cast.

llvm-svn: 345864
2018-11-01 18:02:27 +00:00