Commit Graph

26478 Commits

Author SHA1 Message Date
Eli Friedman 0bbb0d0720 [ARM] Add MemOperand to LDRcp to enable DCE.
LDRcp should be deleted when the dest register is dead in register
coalescing. Without MemOp, dead LDRcp will cause dead constant pool
value which references to non-existing label.

Patch by Yin Ma.

Differential Revision: https://reviews.llvm.org/D54173

llvm-svn: 346563
2018-11-09 23:09:17 +00:00
Thomas Lively 97bef83690 [WebAssembly] Disable custom NaN payload tests
Summary:
These tests fail on 32-bit builds because NaN payload bits in floating point
immediates are not necessarily preserved through compilation. This is because
the MC layer uses native doubles to store these values. The tests will be
reenabled once this problem has been fixed or deleted if we decide we don't care
about lowering payload bits.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54353

llvm-svn: 346558
2018-11-09 22:04:37 +00:00
Craig Topper 17d64c71c5 [X86] Move the promotion of v16i16->v16i8 for avx512f but not avx512bw from lowering to isel. Change to use vpmovzx instead of vpmovsx.
With avx512f but not avx512bw we need to extend to v16i32 then truncate that to to v16i8. Previously we emitted both nodes during lowering, but I'm trying to switch to using target independent nodes and with that switched the extend+truncate wou

This patch changes the implementation to what will be necessary with that patch which helps minimize test diffs.

llvm-svn: 346552
2018-11-09 20:09:53 +00:00
Bryan Chan 123553921f [AArch64] Support HiSilicon's TSV110 processor
Reviewers: t.p.northover, SjoerdMeijer, kristof.beyls

Reviewed By: kristof.beyls

Subscribers: olista01, javed.absar, kristof.beyls, kristina, llvm-commits

Differential Revision: https://reviews.llvm.org/D53908

llvm-svn: 346546
2018-11-09 19:32:08 +00:00
Ulrich Weigand e32d129712 [SystemZ] Add a couple of missing tests
A few fp128 tests were omitted from test/CodeGen/SystemZ/fp-round-01.ll
since in early days, LLVM couldn't handle implicitly generated library
calls to functions with long double arguments on SystemZ.

This deficiency was actually long since fixed, but those tests are
still missing.  This patch adds the missing tests.  NFC.

llvm-svn: 346541
2018-11-09 19:16:21 +00:00
Paul Robinson ddbde9a4ad [DWARFv5] Emit normal type units in .debug_info comdats.
Differential Revision: https://reviews.llvm.org/D54282

llvm-svn: 346540
2018-11-09 19:06:09 +00:00
Craig Topper 731ea7dbc1 [X86] Turn X86ISD::VSEXT into X86ISD::VZEXT if the upper bits aren't demanded.
This makes X86ISD::VSEXT more similar to ISD::SIGN_EXTEND and ISD::ZERO_EXTEND.

I'm hoping to replace X86ISD::VSEXT/VZEXT with target independent nodes. Making the target specific nodes similar to the target independent nodes helps minimize test diffs in that patch.

llvm-svn: 346539
2018-11-09 19:05:51 +00:00
Stanislav Mekhanoshin 26299e2af1 [AMDGPU] Cleanup optimize-if-exec-masking.mir test. NFC.
llvm-svn: 346533
2018-11-09 18:23:39 +00:00
Brendon Cahoon ac8fed68d5 [Hexagon] Implement noreturn optimization
Eliminate the stack frame in functions with the noreturn nounwind
attributes, and when the noreturn-stack-elim target feature is
enabled. This reduces the code and stack space needed for noreturn
functions.

Differential Revision: https://reviews.llvm.org/D54210

llvm-svn: 346532
2018-11-09 18:16:24 +00:00
Craig Topper 9a7e19b8f2 [DAGCombiner][X86][Mips] Enable combineShuffleOfScalars to run between vector op legalization and DAG legalization. Fix bad one use check in combineShuffleOfScalars
It's possible for vector op legalization to generate a shuffle. If that happens we should give a chance for DAG combine to combine that with a build_vector input.

I also fixed a bug in combineShuffleOfScalars that was considering the number of uses on a undef input to a shuffle. We don't care how many times undef is used.

Differential Revision: https://reviews.llvm.org/D54283

llvm-svn: 346530
2018-11-09 18:04:34 +00:00
Krzysztof Parzyszek 8567de0871 [Hexagon] Place globals with explicit .sdata section in small data
Both -fPIC and -G0 disable placement of globals in small data section,
but if a global has an explicit section assigmnent placing it in small
data, it should go there anyway.

llvm-svn: 346523
2018-11-09 17:31:22 +00:00
Zaara Syeda 5c179bf14b [Power9] Allow gpr callee saved spills in prologue to vectors registers
Currently in llvm, CalleeSavedInfo can only assign a callee saved register to
stack frame index to be spilled in the prologue. We would like to enable
spilling gprs to vector registers. This patch adds the capability to spill to
other registers aside from just the stack. It also adds the changes for power9
to spill gprs to volatile vector registers when they are available.
This happens only for leaf functions when using the option
-ppc-enable-pe-vector-spills.

Differential Revision: https://reviews.llvm.org/D39386

llvm-svn: 346512
2018-11-09 16:36:24 +00:00
Jonas Paulsson 458b7c0b39 [SystemZ] Avoid inserting same value after replication
A minor improvement of buildVector() that skips creating an
INSERT_VECTOR_ELT for a Value which has already been used for the
REPLICATE.

Review: Ulrich Weigand
https://reviews.llvm.org/D54315

llvm-svn: 346504
2018-11-09 15:44:28 +00:00
Nicolai Haehnle d5199e39c0 AMDGPU: Add testcase to demonstrate a condition with pre-existing waitcnt
Relevant for https://reviews.llvm.org/D54226.

llvm-svn: 346501
2018-11-09 15:13:12 +00:00
Sam Parker 2804f32ec4 [ARM] Don't promote i1 types in ARM CGP
Now that we have mixed type sizes, i1 values need to be explicitly
handled as we want to avoid promoting these values.

Differential Revision: https://reviews.llvm.org/D54308

llvm-svn: 346499
2018-11-09 15:06:33 +00:00
Sanjay Patel fa1c0fe478 [x86] try to form broadcast before widening shuffle elements
I noticed that we weren't generating broadcasts as much I thought we would with 
D54271, and this is part of the problem.

Widening the shuffle elements means adding bitcasts and hiding the relationship 
between a splatted scalar and the vector. If we can form a broadcast, do that 
before going through the rest of the shuffle lowering because broadcasts should 
be cheap and can often be load-folded.

Differential Revision: https://reviews.llvm.org/D54280

llvm-svn: 346498
2018-11-09 14:54:58 +00:00
Alex Bradbury 1cc2d0b9fb [RISCV] Avoid unnecessary XOR for seteq/setne 0
Differential Revision: https://reviews.llvm.org/D53492

Patch by James Clarke.

llvm-svn: 346497
2018-11-09 14:47:36 +00:00
Alex Bradbury a9da1de5f4 [RISCV] Update test/CodeGen/RISCV/calling-conv.ll after rL346432
The DAGCombiner changes led to a different schedule.

llvm-svn: 346496
2018-11-09 14:35:44 +00:00
Alexandros Lamprineas e15c982f6d [SelectionDAG] swap select_cc operands to enable folding
The DAGCombiner tries to SimplifySelectCC as follows:

  select_cc(x, y, 16, 0, cc) -> shl(zext(set_cc(x, y, cc)), 4)

It can't cope with the situation of reordered operands:

  select_cc(x, y, 0, 16, cc)

In that case we just need to swap the operands and invert the Condition Code:

  select_cc(x, y, 16, 0, ~cc)

Differential Revision: https://reviews.llvm.org/D53236

llvm-svn: 346484
2018-11-09 11:09:40 +00:00
Clement Courbet e6b727e552 [X86] Fix VZEROUPPER scheduling info on SNB,HSW,BDW,SXL,SKX.
Summary:
Starting from SNB, VZEROUPPER is handled by the renamer and uses no proc resources.
After HSW, it also has zero latency.

This fixes PR35606.

To reproduce:
Uops:
  llvm-exegesis -mode=uops -opcode-name=VZEROUPPER
Latency:
  echo -e '#LLVM-EXEGESIS-DEFREG XMM0 1\n#LLVM-EXEGESIS-DEFREG XMM1 1\nvzeroupper' | /tmp/llvm-exegesis -mode=latency -snippets-file=-
  echo -e '#LLVM-EXEGESIS-DEFREG XMM0 1\n#LLVM-EXEGESIS-DEFREG XMM1 1\nvzeroupper\naddps %xmm0, %xmm1' | /tmp/llvm-exegesis -mode=latency -snippets-file=-

Reviewers: RKSimon, craig.topper, andreadb

Subscribers: gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D54107

llvm-svn: 346482
2018-11-09 09:49:06 +00:00
Carlos Alberto Enciso fa9cf89734 [DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG.
In SimplifyCFG when given a conditional branch that goes to BB1 and BB2, the hoisted common terminator instruction in the two blocks, caused debug line records associated with subsequent select instructions to become ambiguous. It causes the debugger to display unreachable source lines.

Differential Revision: https://reviews.llvm.org/D53390

llvm-svn: 346481
2018-11-09 09:42:10 +00:00
Sam Parker 08979cd125 [ARM] Enable mixed types in ARM CGP
Previously, during the search, all values had to have the same
'TypeSize', which is equal to number of bits of the integer type of
the icmp operand. All values in the tree had to match this size;
meaning that, if we searched from i16, we wouldn't accept i8s. A
change in type size requires zext and truncs to perform the casts so,
to allow mixed narrow types, the handling of these instructions is
now slightly different:

- we allow casts if their result or operand is <= TypeSize.
- zexts are sinks if their result > TypeSize.
- truncs are still sinks if their operand == TypeSize.
- truncs are still sources if their result == TypeSize.

The transformation bails on finding an icmp that operates on data
smaller than the current TypeSize.

Differential Revision: https://reviews.llvm.org/D54108

llvm-svn: 346480
2018-11-09 09:28:27 +00:00
Mandeep Singh Grang 397765bc51 [COFF, ARM64] Add support for MSVC buffer security check
Reviewers: rnk, mstorsjo, compnerd, efriedma, TomTan

Reviewed By: rnk

Subscribers: javed.absar, kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D54248

llvm-svn: 346469
2018-11-09 02:48:36 +00:00
Thomas Lively 38c902bc2e [WebAssembly] Lower select for vectors
Summary:

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53675

llvm-svn: 346462
2018-11-09 01:38:44 +00:00
Heejin Ahn 0c68a875fa [WebAssembly] Fix LowerEmscriptenEHSjLj when there's only longjmp
Summary:
The pass incorrectly assumed if there's a longjmp declaration in the
module, there is also a setjmp function declaration. Fixed it, and now
the pass only converts longjmp and does not do any other transformation
when there's no setjmp declaration in the module.

Fixes PR39562.

Reviewers: jgravelle-google, sbc100

Subscribers: dschuff, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54273

llvm-svn: 346445
2018-11-08 22:56:26 +00:00
Simon Pilgrim 0b01062dba [X86] Regenerate loaduse test
llvm-svn: 346434
2018-11-08 19:42:11 +00:00
Sanjay Patel b5535dc7b3 [x86] use shuffles for scalar insertion into high elements of a constant vector
As discussed in D54073, we have a potential regression from more aggressive vector narrowing here, so let's try to avoid that by changing build-vector lowering slightly.

Insert-vector-element lowering always does this since there's no "pinsr" for ymm/zmm:

// If the vector is wider than 128 bits, extract the 128-bit subvector, insert
// into that, and then insert the subvector back into the result.

...but we can sometimes do better for insert-into-constant-vector by using shuffle lowering.

Differential Revision: https://reviews.llvm.org/D54271

llvm-svn: 346433
2018-11-08 19:16:27 +00:00
Nirav Dave 6ce9f72f76 [DAGCombine] Improve alias analysis for chain of independent stores.
FindBetterNeighborChains simulateanously improves the chain
dependencies of a chain of related stores avoiding the generation of
extra token factors. For chains longer than the GatherAllAliasDepths,
stores further down in the chain will necessarily fail, a potentially
significant waste and preventing otherwise trivial parallelization.

This patch directly parallelize the chains of stores before improving
each store. This generally improves DAG-level parallelism.

Reviewers: courbet, spatel, RKSimon, bogner, efriedma, craig.topper, rnk

Subscribers: sdardis, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D53552

llvm-svn: 346432
2018-11-08 19:14:20 +00:00
Sanjay Patel c4f719feb0 [x86] add RUNs for AVX1; NFC
Differences in splat-ability might be reason to differentiate some cases.

llvm-svn: 346426
2018-11-08 18:18:20 +00:00
Nicolai Haehnle 6979b70427 Add test case for the regression caused by r344696
(That change has since been reverted.)

Reduced from https://bugs.freedesktop.org/show_bug.cgi?id=108611

llvm-svn: 346423
2018-11-08 18:01:38 +00:00
Davide Italiano ac8279ab8b Revert "[MSP430] Add MC layer"
This commit broke the module buildbots.
Error:

lib/Target/MSP430/MSP430GenAsmMatcher.inc:1027:1: error: redundant
namespace 'llvm' [-Wmodules-import-nested-redundant]
^

llvm-svn: 346410
2018-11-08 16:21:29 +00:00
Jonas Paulsson 1993894c03 [SystemZ] Bugfix in shouldCoalesce()
It was discovered in randomized testing that the SystemZ implementation of
shouldCoalesce() could be caused to crash when subreg liveness was
enabled. This was because an undef use of the virtual register was copied
outside current MBB at the point of shouldCoalesce() being called. For more
details, see https://bugs.llvm.org/show_bug.cgi?id=39276.

This patch changes the check for MBB locality from livein/liveout checks to
do checks for all instructions of both intervals being inside MBB. This
avoids the cases with dead defs / undef uses outside MBB, which are not
affecting liveness in/out of MBB.

The original test case included as a reduced .mir test case.

Review: Ulrich Weigand
https://reviews.llvm.org/D54197

llvm-svn: 346406
2018-11-08 15:29:48 +00:00
Simon Pilgrim b917740ac3 [X86][SSE] Add PR39387 shuffle test case
llvm-svn: 346402
2018-11-08 14:07:17 +00:00
Petr Pavlu 7c84b2e3ab [ARM] Enable spilling of the hGPR register class in Thumb2
Generalize code in Thumb2InstrInfo::storeRegToStackSlot() and
loadRegToStackSlot() to allow the GPR class or any of its sub-classes
(including hGPR) to be stored/loaded by ARM::t2STRi12/ARM::t2LDRi12.

Differential Revision: https://reviews.llvm.org/D51927

llvm-svn: 346401
2018-11-08 13:02:10 +00:00
Simon Pilgrim 1ef4af5278 [X86][AVX] Tidyup prefixes and regenerate interleaved tests
Share common AVX prefix and split off AVX2OR512 prefix instead

llvm-svn: 346399
2018-11-08 12:14:10 +00:00
Thomas Lively 897171902b [WebAssembly] Add V128 to WebAssemblyInstrInfo::copyPhysReg
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53872

llvm-svn: 346384
2018-11-08 02:35:28 +00:00
Stanislav Mekhanoshin 6cc8b2fc65 [AMDGPU] Extend promote alloca vectorization
Promote alloca can vectorize a small array by bitcasting it to a
vector type. Extend vectorization for the case when alloca is
already a vector type. We still want to replace GEPs with an
insert/extract element instructions in this case.

Differential Revision: https://reviews.llvm.org/D54219

llvm-svn: 346376
2018-11-08 00:16:23 +00:00
Anton Korobeynikov 09dff53840 [MSP430] Add MC layer
Summary:
This change implements assembler parser, code emitter, ELF object writer
and disassembler for the MSP430 ISA.  Also, more instruction forms are added
to the target description.

Reviewers: asl

Reviewed By: asl

Subscribers: pftbest, krisb, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D53661

llvm-svn: 346374
2018-11-08 00:03:45 +00:00
Daniel Sanders 162da63e05 Add 'REQUIRES: default_triple' to test/CodeGen/MIR/X86/zero-probability.mir
llvm-svn: 346368
2018-11-07 23:33:55 +00:00
Nicolai Haehnle bc233f5523 Revert "AMDGPU: Divergence-driven selection of scalar buffer load intrinsics"
This reverts commit r344696 for now (except for some test additions).

See https://bugs.freedesktop.org/show_bug.cgi?id=108611.

llvm-svn: 346364
2018-11-07 21:53:43 +00:00
Eli Friedman d00fb2e0a8 [AArch64] [Windows] Trap after noreturn calls.
Like the comment says, this isn't the most efficient fix in terms of
codesize, but it works.

Differential Revision: https://reviews.llvm.org/D54129

llvm-svn: 346358
2018-11-07 21:31:14 +00:00
Eli Friedman 7d7d41debc [ARM] Fix CPSR liveness in tMOVCCr_pseudo lowering.
The lowering was missing live-ins in certain cases, like a sequence of
multiple tMOVCCr_pseudo instructions.  This would lead to a verifier
failure, and on pre-v6 Thumb CPSR would be incorrectly clobbered.

For reasons I don't completely understand, it's hard to get a sequence
of multiple tMOVCCr_pseudo instructions; the issue only seems to show up
with 64-bit comparisons where the result is zero-extended. I added some
extra testcases in case that changes in the future. Probably some
optimization opportunities here if anyone is interested. (@test_slt_not
is the case that was getting miscompiled.)

The code to check the liveness of CPSR was stolen from
X86ISelLowering.cpp; maybe it could be refactored into common helper,
but I have no idea where to put it.

Differential Revision: https://reviews.llvm.org/D54192

llvm-svn: 346355
2018-11-07 21:08:13 +00:00
Matt Arsenault 8ba740a5a8 Allow subclassing ExternalAA
This allows testing AMDGPU alias analysis like any
other alias analysis pass. This fixes the existing
test pointlessly running opt -O3 when it really
just wants to run the one analysis.

Before there was no way to test this using -aa-eval
with opt, since the default constructed pass
is run. The wrapper subclass allows the
default constructor to pass the necessary callback.

llvm-svn: 346353
2018-11-07 20:26:42 +00:00
Than McIntosh 5bcdea5118 [X86] improve split-stack machine BB placement
Summary:
The conditional branch created to support -fsplit-stack for X86 is
left unbiased/unhinted, resulting in less than ideal block placement:
the __morestack call block is kept on the main hot path. Bias the
branch to insure that the stack allocation block is treated as a
"cold" block during machine basic block placement.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54123

llvm-svn: 346336
2018-11-07 17:41:57 +00:00
James Y Knight 5fbc72f526 Workaround PPC backend bug in test for r346322.
It seems that the PPC backend croaks when lowering a call to a
function with an argument of type [2 x i32].

Just modify the type slightly to avoid this -- I wasn't actually
intending to stress test the backend...

llvm/lib/Target/PowerPC/PPCISelLowering.cpp:6172: llvm::SDValue llvm::PPCTargetLowering::LowerCall_64SVR4(...): Assertion `(!HasParameterArea || NumBytesActuallyUsed == ArgOffset) && "mismatch in size of parameter area"' failed.

llvm-svn: 346334
2018-11-07 17:01:47 +00:00
James Y Knight 72f76bf230 Add support for llvm.is.constant intrinsic (PR4898)
This adds the llvm-side support for post-inlining evaluation of the
__builtin_constant_p GCC intrinsic.

Also fixed SCCPSolver::visitCallSite to not blow up when seeing a call
to a function where canConstantFoldTo returns true, and one of the
arguments is a struct.

Updated from patch initially by Janusz Sobczak.

Differential Revision: https://reviews.llvm.org/D4276

llvm-svn: 346322
2018-11-07 15:24:12 +00:00
Petar Avramovic 2624c8db68 [MIPS GlobalISel] Set operand order for G_MERGE and G_UNMERGE
Set operands order for G_MERGE_VALUES and G_UNMERGE_VALUES so
that least significant bits always go first, regardless of endianness.

Differential Revision: https://reviews.llvm.org/D54098

llvm-svn: 346305
2018-11-07 11:45:43 +00:00
Matthias Braun 5b7c90b4e2 RegAllocFast: Leave unassigned virtreg entries in map
Set `LiveReg::PhysReg` to zero when freeing a register instead of
removing it from the entry from `LiveRegMap`. This way no iterators get
invalidated and we can avoid passing around and updating iterators all
over the place.

This does not change any allocator decisions. It is not completely NFC
because the arbitrary iteration order through `LiveRegMap` in
`spillAll()` changes so we may get a different order in those spill
sequences (the amount of spills does not change).

This is in preparation of https://reviews.llvm.org/D52010.

llvm-svn: 346298
2018-11-07 06:57:03 +00:00
Yaxun Liu 73bf0af32f AMDGPU: Add an option -disable-promote-alloca-to-lds
Add this option for debugging and providing workaround.

By default it is off so no behavior change in backend.

Differential Revision: https://reviews.llvm.org/D54158

llvm-svn: 346267
2018-11-06 21:28:17 +00:00
Craig Topper 6428a2cd9a [X86] Add custom promotion of v2i8/v2i16 fp_to_sint to avoid over promotion to v2i64 which would force scalarization.
llvm-svn: 346259
2018-11-06 19:24:21 +00:00