Commit Graph

57361 Commits

Author SHA1 Message Date
Craig Topper a1b6667c6a [X86] Use a MOVSX instruction instead of a MOVZX instruction in isel for an any_extend of the remainder from an 8-bit sdivrem.
The sdivrem will emit its own MOVSX to move %ah to the low byte of a register. By using a MOVSX for an any_extend this allows a post-isel peephole to merge them.

llvm-svn: 346581
2018-11-10 06:04:33 +00:00
Craig Topper dc12535e00 [X86] Add a test case to show scalarized vector srem to demonstrate unnecessary instructions. NFC
After the division %ah is being sign extended to move it to lower byte of a register while avoiding a partial register read. We then zero extend the low byte to the full 32 bit register. But we don't use any of the zero extended bits. In the DAG the zero extend was really an any_extend so the sign extend should have been enough.

llvm-svn: 346580
2018-11-10 06:04:09 +00:00
Matthias Braun 0261d6e36a test/CodeGen/X86: Relax test case
No need to hardcode register or expecting totally unnecessary spills
from the allocator.

llvm-svn: 346575
2018-11-10 00:34:09 +00:00
Craig Topper 0364085281 [X86] In LowerHorizontalByteSum, emit vector_shuffle nodes instead of directly using X86ISD::UNPCKL/X86ISD::UNPCKH.
This gives shuffle lowering the freedom to use zero_extend_vector_inreg for the unpckl shuffle. Shuffle combining usually makes this swap later, but not when AVX512 is enabled it seems.

While there also use DAG.getConstant to create a 0 vector instead of using the helper the forces a specific BUILD_VECTOR. I don't think that helper is usually needed. We're basically free to create a constant build_vector anytime and it will be legalized on its own.

llvm-svn: 346574
2018-11-10 00:26:42 +00:00
Eli Friedman ad1151cf6a [ARM64] [Windows] Handle funclets
This patch adds support for funclets in frame lowering and ISel
lowering. Together with D50288 and D50166, it enables C++ exception
handling.

Patch by Sanjin Sijaric, with some fixes by me.

Differential Revision: https://reviews.llvm.org/D51524

llvm-svn: 346568
2018-11-09 23:33:30 +00:00
Dylan McKay 6fddb53685 [AVR] Reorder the CHECK lines in directmem.ll to match current trunk
In r346432 ("[DAGCombine] Improve alias analysis for chain of independent stores"),
the order of ldi/sts blocks changed.

The new IR is equivalent to the old IR.

This patch updates the test to fix the test suite.

llvm-svn: 346565
2018-11-09 23:17:59 +00:00
Eli Friedman 0bbb0d0720 [ARM] Add MemOperand to LDRcp to enable DCE.
LDRcp should be deleted when the dest register is dead in register
coalescing. Without MemOp, dead LDRcp will cause dead constant pool
value which references to non-existing label.

Patch by Yin Ma.

Differential Revision: https://reviews.llvm.org/D54173

llvm-svn: 346563
2018-11-09 23:09:17 +00:00
Eli Friedman 15930bf352 [JumpThreading] Fix exponential time algorithm computing known values.
ComputeValueKnownInPredecessors has a "visited" set to prevent infinite
loops, since a value can be visited more than once.  However, the
implementation didn't prevent the algorithm from taking exponential
time. Instead of removing elements from the RecursionSet one at a time,
we should keep around the whole set until
ComputeValueKnownInPredecessors finishes, then discard it.

The testcase is synthetic because I was having trouble effectively
reducing the original.  But it's basically the same idea.

Instead of failing, we could theoretically cache the result instead.
But I don't think it would help substantially in practice.

Differential Revision: https://reviews.llvm.org/D54239

llvm-svn: 346562
2018-11-09 22:35:26 +00:00
Thomas Lively 97bef83690 [WebAssembly] Disable custom NaN payload tests
Summary:
These tests fail on 32-bit builds because NaN payload bits in floating point
immediates are not necessarily preserved through compilation. This is because
the MC layer uses native doubles to store these values. The tests will be
reenabled once this problem has been fixed or deleted if we decide we don't care
about lowering payload bits.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54353

llvm-svn: 346558
2018-11-09 22:04:37 +00:00
Craig Topper 17d64c71c5 [X86] Move the promotion of v16i16->v16i8 for avx512f but not avx512bw from lowering to isel. Change to use vpmovzx instead of vpmovsx.
With avx512f but not avx512bw we need to extend to v16i32 then truncate that to to v16i8. Previously we emitted both nodes during lowering, but I'm trying to switch to using target independent nodes and with that switched the extend+truncate wou

This patch changes the implementation to what will be necessary with that patch which helps minimize test diffs.

llvm-svn: 346552
2018-11-09 20:09:53 +00:00
Bryan Chan 123553921f [AArch64] Support HiSilicon's TSV110 processor
Reviewers: t.p.northover, SjoerdMeijer, kristof.beyls

Reviewed By: kristof.beyls

Subscribers: olista01, javed.absar, kristof.beyls, kristina, llvm-commits

Differential Revision: https://reviews.llvm.org/D53908

llvm-svn: 346546
2018-11-09 19:32:08 +00:00
Nico Weber dfc08baceb [MS demangler] Use a slightly shorter unmangling for mangled strings.
Before: const wchar_t * {L"%"}
Now: L"%"

See also PR39593.
Differential Revision: https://reviews.llvm.org/D54294

llvm-svn: 346544
2018-11-09 19:28:50 +00:00
Ulrich Weigand e32d129712 [SystemZ] Add a couple of missing tests
A few fp128 tests were omitted from test/CodeGen/SystemZ/fp-round-01.ll
since in early days, LLVM couldn't handle implicitly generated library
calls to functions with long double arguments on SystemZ.

This deficiency was actually long since fixed, but those tests are
still missing.  This patch adds the missing tests.  NFC.

llvm-svn: 346541
2018-11-09 19:16:21 +00:00
Paul Robinson ddbde9a4ad [DWARFv5] Emit normal type units in .debug_info comdats.
Differential Revision: https://reviews.llvm.org/D54282

llvm-svn: 346540
2018-11-09 19:06:09 +00:00
Craig Topper 731ea7dbc1 [X86] Turn X86ISD::VSEXT into X86ISD::VZEXT if the upper bits aren't demanded.
This makes X86ISD::VSEXT more similar to ISD::SIGN_EXTEND and ISD::ZERO_EXTEND.

I'm hoping to replace X86ISD::VSEXT/VZEXT with target independent nodes. Making the target specific nodes similar to the target independent nodes helps minimize test diffs in that patch.

llvm-svn: 346539
2018-11-09 19:05:51 +00:00
Simon Pilgrim fc8f1d7da7 [CostModel][X86] SK_ExtractSubvector is free if the subvector is at the start of the source vector
llvm-svn: 346538
2018-11-09 19:04:27 +00:00
Stanislav Mekhanoshin 26299e2af1 [AMDGPU] Cleanup optimize-if-exec-masking.mir test. NFC.
llvm-svn: 346533
2018-11-09 18:23:39 +00:00
Brendon Cahoon ac8fed68d5 [Hexagon] Implement noreturn optimization
Eliminate the stack frame in functions with the noreturn nounwind
attributes, and when the noreturn-stack-elim target feature is
enabled. This reduces the code and stack space needed for noreturn
functions.

Differential Revision: https://reviews.llvm.org/D54210

llvm-svn: 346532
2018-11-09 18:16:24 +00:00
Greg Clayton 44487b655d Add total function byte size and inline function byte size to "llvm-dwarfdump --statistics"
Differential Revision: https://reviews.llvm.org/D54217

llvm-svn: 346531
2018-11-09 18:10:02 +00:00
Craig Topper 9a7e19b8f2 [DAGCombiner][X86][Mips] Enable combineShuffleOfScalars to run between vector op legalization and DAG legalization. Fix bad one use check in combineShuffleOfScalars
It's possible for vector op legalization to generate a shuffle. If that happens we should give a chance for DAG combine to combine that with a build_vector input.

I also fixed a bug in combineShuffleOfScalars that was considering the number of uses on a undef input to a shuffle. We don't care how many times undef is used.

Differential Revision: https://reviews.llvm.org/D54283

llvm-svn: 346530
2018-11-09 18:04:34 +00:00
Jordan Rupprecht dcf1f8e716 [llvm-strings] Fix whitespaces to match strings output.
Summary:
The current implementation prepends a space on every line, making it difficult to compare against GNU strings.

The space appears to have come from handling --radix in rL292707. The space is for making sure there's a space between the radix and the value; however the space is still emitted even when there is no radix. This change fixes that so the space is only emitted when there is a radix.

Reviewers: jhenderson

Reviewed By: jhenderson

Subscribers: llvm-commits, compnerd

Differential Revision: https://reviews.llvm.org/D54238

llvm-svn: 346529
2018-11-09 18:03:21 +00:00
Krzysztof Parzyszek 8567de0871 [Hexagon] Place globals with explicit .sdata section in small data
Both -fPIC and -G0 disable placement of globals in small data section,
but if a global has an explicit section assigmnent placing it in small
data, it should go there anyway.

llvm-svn: 346523
2018-11-09 17:31:22 +00:00
Zaara Syeda 5c179bf14b [Power9] Allow gpr callee saved spills in prologue to vectors registers
Currently in llvm, CalleeSavedInfo can only assign a callee saved register to
stack frame index to be spilled in the prologue. We would like to enable
spilling gprs to vector registers. This patch adds the capability to spill to
other registers aside from just the stack. It also adds the changes for power9
to spill gprs to volatile vector registers when they are available.
This happens only for leaf functions when using the option
-ppc-enable-pe-vector-spills.

Differential Revision: https://reviews.llvm.org/D39386

llvm-svn: 346512
2018-11-09 16:36:24 +00:00
Simon Pilgrim d0c71609c5 [CostModel] Add SK_ExtractSubvector handling to getInstructionThroughput (PR39368)
Add ShuffleVectorInst::isExtractSubvectorMask helper to match shuffle masks.

llvm-svn: 346510
2018-11-09 16:28:19 +00:00
Alexey Bataev 93d018a916 Revert "[DEBUGINFO, NVPTX]DO not emit ',debug' option if no debug info or only debug directives are requested."
This reverts commit r345972. Need to update the description + possibly
to update the patch itself after discussion with Eric Christofer.

llvm-svn: 346508
2018-11-09 16:22:35 +00:00
Max Moroz b2091c930b [llvm-cov] Add lcov tracefile export format.
Summary:
lcov tracefiles are used by various coverage reporting tools and build
systems (e.g., Bazel). It is a simple text-based format to parse and
more convenient to use than the JSON export format, which needs
additional processing to map regions/segments back to line numbers.

It's a little unfortunate that "text" format is now overloaded to refer
specifically to JSON for export, but I wanted to avoid making any
breaking changes to the UI of the llvm-cov tool at this time.

Patch by Tony Allevato (@allevato).

Reviewers: Dor1s, vsk

Reviewed By: Dor1s, vsk

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D54266

llvm-svn: 346506
2018-11-09 16:10:44 +00:00
Jonas Paulsson 458b7c0b39 [SystemZ] Avoid inserting same value after replication
A minor improvement of buildVector() that skips creating an
INSERT_VECTOR_ELT for a Value which has already been used for the
REPLICATE.

Review: Ulrich Weigand
https://reviews.llvm.org/D54315

llvm-svn: 346504
2018-11-09 15:44:28 +00:00
Nicolai Haehnle d5199e39c0 AMDGPU: Add testcase to demonstrate a condition with pre-existing waitcnt
Relevant for https://reviews.llvm.org/D54226.

llvm-svn: 346501
2018-11-09 15:13:12 +00:00
Sam Parker 2804f32ec4 [ARM] Don't promote i1 types in ARM CGP
Now that we have mixed type sizes, i1 values need to be explicitly
handled as we want to avoid promoting these values.

Differential Revision: https://reviews.llvm.org/D54308

llvm-svn: 346499
2018-11-09 15:06:33 +00:00
Sanjay Patel fa1c0fe478 [x86] try to form broadcast before widening shuffle elements
I noticed that we weren't generating broadcasts as much I thought we would with 
D54271, and this is part of the problem.

Widening the shuffle elements means adding bitcasts and hiding the relationship 
between a splatted scalar and the vector. If we can form a broadcast, do that 
before going through the rest of the shuffle lowering because broadcasts should 
be cheap and can often be load-folded.

Differential Revision: https://reviews.llvm.org/D54280

llvm-svn: 346498
2018-11-09 14:54:58 +00:00
Alex Bradbury 1cc2d0b9fb [RISCV] Avoid unnecessary XOR for seteq/setne 0
Differential Revision: https://reviews.llvm.org/D53492

Patch by James Clarke.

llvm-svn: 346497
2018-11-09 14:47:36 +00:00
Alex Bradbury a9da1de5f4 [RISCV] Update test/CodeGen/RISCV/calling-conv.ll after rL346432
The DAGCombiner changes led to a different schedule.

llvm-svn: 346496
2018-11-09 14:35:44 +00:00
Krzysztof Parzyszek f740fd647a [Hexagon] Handle Hexagon's SHF_HEX_GPREL section flag
llvm-svn: 346494
2018-11-09 14:17:27 +00:00
Florian Hahn 9f878e9bae Revert r346483: [CallSiteSplitting] Only record conditions up to the IDom(call site).
This cause a failure with EXPENSIVE_CHECKS

llvm-svn: 346492
2018-11-09 13:28:58 +00:00
Florian Hahn a1062f4b68 [IPSCCP,PM] Preserve DT in the new pass manager.
After D45330, Dominators are required for IPSCCP and can be preserved.

This patch preserves DominatorTreeAnalysis in the new pass manager. AFAIK the legacy pass manager cannot preserve function analysis required by a module analysis.

Reviewers: davide, dberlin, chandlerc, efriedma, kuhar, NutshellySima

Reviewed By: chandlerc, kuhar, NutshellySima

Differential Revision: https://reviews.llvm.org/D47259

llvm-svn: 346486
2018-11-09 11:52:27 +00:00
Alexandros Lamprineas e15c982f6d [SelectionDAG] swap select_cc operands to enable folding
The DAGCombiner tries to SimplifySelectCC as follows:

  select_cc(x, y, 16, 0, cc) -> shl(zext(set_cc(x, y, cc)), 4)

It can't cope with the situation of reordered operands:

  select_cc(x, y, 0, 16, cc)

In that case we just need to swap the operands and invert the Condition Code:

  select_cc(x, y, 16, 0, ~cc)

Differential Revision: https://reviews.llvm.org/D53236

llvm-svn: 346484
2018-11-09 11:09:40 +00:00
Florian Hahn 52578f95c9 [CallSiteSplitting] Only record conditions up to the IDom(call site).
We can stop recording conditions once we reached the immediate dominator
for the block containing the call site. Conditions in predecessors of the
that node will be the same for all paths to the call site and splitting
is not beneficial.

This patch makes CallSiteSplitting dependent on the DT anlysis. because
the immediate dominators seem to be the easiest way of finding the node
to stop at.

I had to update some exiting tests, because they were checking for
conditions that were true/false on all paths to the call site. Those
should now be handled by instcombine/ipsccp.

Reviewers: davide, junbuml

Reviewed By: junbuml

Differential Revision: https://reviews.llvm.org/D44627

llvm-svn: 346483
2018-11-09 10:23:46 +00:00
Clement Courbet e6b727e552 [X86] Fix VZEROUPPER scheduling info on SNB,HSW,BDW,SXL,SKX.
Summary:
Starting from SNB, VZEROUPPER is handled by the renamer and uses no proc resources.
After HSW, it also has zero latency.

This fixes PR35606.

To reproduce:
Uops:
  llvm-exegesis -mode=uops -opcode-name=VZEROUPPER
Latency:
  echo -e '#LLVM-EXEGESIS-DEFREG XMM0 1\n#LLVM-EXEGESIS-DEFREG XMM1 1\nvzeroupper' | /tmp/llvm-exegesis -mode=latency -snippets-file=-
  echo -e '#LLVM-EXEGESIS-DEFREG XMM0 1\n#LLVM-EXEGESIS-DEFREG XMM1 1\nvzeroupper\naddps %xmm0, %xmm1' | /tmp/llvm-exegesis -mode=latency -snippets-file=-

Reviewers: RKSimon, craig.topper, andreadb

Subscribers: gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D54107

llvm-svn: 346482
2018-11-09 09:49:06 +00:00
Carlos Alberto Enciso fa9cf89734 [DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG.
In SimplifyCFG when given a conditional branch that goes to BB1 and BB2, the hoisted common terminator instruction in the two blocks, caused debug line records associated with subsequent select instructions to become ambiguous. It causes the debugger to display unreachable source lines.

Differential Revision: https://reviews.llvm.org/D53390

llvm-svn: 346481
2018-11-09 09:42:10 +00:00
Sam Parker 08979cd125 [ARM] Enable mixed types in ARM CGP
Previously, during the search, all values had to have the same
'TypeSize', which is equal to number of bits of the integer type of
the icmp operand. All values in the tree had to match this size;
meaning that, if we searched from i16, we wouldn't accept i8s. A
change in type size requires zext and truncs to perform the casts so,
to allow mixed narrow types, the handling of these instructions is
now slightly different:

- we allow casts if their result or operand is <= TypeSize.
- zexts are sinks if their result > TypeSize.
- truncs are still sinks if their operand == TypeSize.
- truncs are still sources if their result == TypeSize.

The transformation bails on finding an icmp that operates on data
smaller than the current TypeSize.

Differential Revision: https://reviews.llvm.org/D54108

llvm-svn: 346480
2018-11-09 09:28:27 +00:00
Petr Hosek e2f6896eef [llvm-rc] Support joined or separate spelling for /fo flag
CMake invokes rc using the joined spelling which appears to be supported
by Microsoft's rc implementation, so we should support it as well.

Differential Revision: https://reviews.llvm.org/D54191

llvm-svn: 346470
2018-11-09 03:16:53 +00:00
Mandeep Singh Grang 397765bc51 [COFF, ARM64] Add support for MSVC buffer security check
Reviewers: rnk, mstorsjo, compnerd, efriedma, TomTan

Reviewed By: rnk

Subscribers: javed.absar, kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D54248

llvm-svn: 346469
2018-11-09 02:48:36 +00:00
Thomas Lively 2faf079494 [WebAssembly] Read prefixed opcodes as ULEB128s
Summary: Depends on D54126.

Reviewers: aheejin, dschuff, aardappel

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54138

llvm-svn: 346465
2018-11-09 01:57:00 +00:00
Thomas Lively 299d214aba [WebAssembly] Renumber and LEB128-encode SIMD opcodes
Reviewers: aheejin, dschuff, aardappel

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54126

llvm-svn: 346463
2018-11-09 01:45:56 +00:00
Thomas Lively 38c902bc2e [WebAssembly] Lower select for vectors
Summary:

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53675

llvm-svn: 346462
2018-11-09 01:38:44 +00:00
Petr Hosek 1f597e6e6b [llvm-rc] Support absolute filenames in manifests
CMake generate manifests that contain absolute filenames and these
currently result in assertion error. This change ensures that we
handle these correctly.

Differential Revision: https://reviews.llvm.org/D54194

llvm-svn: 346450
2018-11-08 23:45:00 +00:00
Heejin Ahn 0c68a875fa [WebAssembly] Fix LowerEmscriptenEHSjLj when there's only longjmp
Summary:
The pass incorrectly assumed if there's a longjmp declaration in the
module, there is also a setjmp function declaration. Fixed it, and now
the pass only converts longjmp and does not do any other transformation
when there's no setjmp declaration in the module.

Fixes PR39562.

Reviewers: jgravelle-google, sbc100

Subscribers: dschuff, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54273

llvm-svn: 346445
2018-11-08 22:56:26 +00:00
Florian Hahn a684a99441 [LoopInterchange] Support reductions across inner and outer loop.
This patch adds logic to detect reductions across the inner and outer
loop by following the incoming values of PHI nodes in the outer loop. If
the incoming values take part in a reduction in the inner loop or come
from outside the outer loop, we found a reduction spanning across inner
and outer loop.

With this change, ~10% more loops are interchanged in the LLVM
test-suite + SPEC2006.

Fixes https://bugs.llvm.org/show_bug.cgi?id=30472

Reviewers: mcrosier, efriedma, karthikthecool, davide, hfinkel, dmgreen

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D43245

llvm-svn: 346438
2018-11-08 20:44:19 +00:00
Pirama Arumuga Nainar e61652a384 [LTO] Drop non-prevailing definitions only if linkage is not local or appending
Summary:
This fixes PR 37422

In ELF, non-weak symbols can also be non-prevailing.  In this particular
PR, the __llvm_profile_* symbols are non-prevailing but weren't getting
dropped - causing multiply-defined errors with lld.

Also add a test, strong_non_prevailing.ll, to ensure that multiple
copies of a strong symbol are dropped.

To fix the test regressions exposed by this fix,
- do not mark prevailing copies for symbols with 'appending' linkage.
There's no one prevailing copy for such symbols.
- fix the prevailing version in dead-strip-fulllto.ll
- explicitly pass exported symbols to llvm-lto in fumcimport.ll and
funcimport_var.ll

Reviewers: tejohnson, pcc

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith,
dang, srhines, llvm-commits

Differential Revision: https://reviews.llvm.org/D54125

llvm-svn: 346436
2018-11-08 20:10:07 +00:00
Simon Pilgrim 0b01062dba [X86] Regenerate loaduse test
llvm-svn: 346434
2018-11-08 19:42:11 +00:00
Sanjay Patel b5535dc7b3 [x86] use shuffles for scalar insertion into high elements of a constant vector
As discussed in D54073, we have a potential regression from more aggressive vector narrowing here, so let's try to avoid that by changing build-vector lowering slightly.

Insert-vector-element lowering always does this since there's no "pinsr" for ymm/zmm:

// If the vector is wider than 128 bits, extract the 128-bit subvector, insert
// into that, and then insert the subvector back into the result.

...but we can sometimes do better for insert-into-constant-vector by using shuffle lowering.

Differential Revision: https://reviews.llvm.org/D54271

llvm-svn: 346433
2018-11-08 19:16:27 +00:00
Nirav Dave 6ce9f72f76 [DAGCombine] Improve alias analysis for chain of independent stores.
FindBetterNeighborChains simulateanously improves the chain
dependencies of a chain of related stores avoiding the generation of
extra token factors. For chains longer than the GatherAllAliasDepths,
stores further down in the chain will necessarily fail, a potentially
significant waste and preventing otherwise trivial parallelization.

This patch directly parallelize the chains of stores before improving
each store. This generally improves DAG-level parallelism.

Reviewers: courbet, spatel, RKSimon, bogner, efriedma, craig.topper, rnk

Subscribers: sdardis, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D53552

llvm-svn: 346432
2018-11-08 19:14:20 +00:00
Sanjay Patel c4f719feb0 [x86] add RUNs for AVX1; NFC
Differences in splat-ability might be reason to differentiate some cases.

llvm-svn: 346426
2018-11-08 18:18:20 +00:00
Roman Lebedev 3817292069 [NFC][BdVer2] Load and store throughput tests: also check sched stats (PR39465)
As noted by Andrea Di Biagio in https://bugs.llvm.org/show_bug.cgi?id=39465
both the loads and stores occupy both the store and load queues.
This is clearly wrong.

llvm-svn: 346425
2018-11-08 18:15:58 +00:00
Nicolai Haehnle 6979b70427 Add test case for the regression caused by r344696
(That change has since been reverted.)

Reduced from https://bugs.freedesktop.org/show_bug.cgi?id=108611

llvm-svn: 346423
2018-11-08 18:01:38 +00:00
Tom Stellard 28d662164d InstCombine: Avoid introducing poison values when lowering llvm.amdgcn.[us]bfe
Summary:
When the 3rd argument to these intrinsics is zero, lowering them
to shift instructions produces poison values, since we end up with
shift amounts equal to the number of bits in the shifted value.  This
means we can only lower these intrinsics if we can prove that the
3rd argument is not zero.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: bnieuwenhuizen, jvesely, wdng, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D53739

llvm-svn: 346422
2018-11-08 17:57:57 +00:00
Vedant Kumar d6699423f1 [CodeExtractor] Mark functions noreturn when applicable
This eliminates the outlining penalty for llvm.trap/unreachable, because
callers no longer have to emit cleanup/ret instructions after calling an
outlined `noreturn` function.

rdar://45523626

llvm-svn: 346421
2018-11-08 17:57:09 +00:00
Adrian Prantl 778fba3188 [dsymutil] Copy the LC_BUILD_VERSION load command into the companion binary.
LC_BUILD_VERSION contains platform information that is useful for LLDB
to match up dSYM bundles with binaries. This patch copies the load
command over into the dSYM.

rdar://problem/44145175
rdar://problem/45883463

Differential Revision: https://reviews.llvm.org/D54233

llvm-svn: 346412
2018-11-08 16:54:59 +00:00
Davide Italiano ac8279ab8b Revert "[MSP430] Add MC layer"
This commit broke the module buildbots.
Error:

lib/Target/MSP430/MSP430GenAsmMatcher.inc:1027:1: error: redundant
namespace 'llvm' [-Wmodules-import-nested-redundant]
^

llvm-svn: 346410
2018-11-08 16:21:29 +00:00
Jonas Paulsson 1993894c03 [SystemZ] Bugfix in shouldCoalesce()
It was discovered in randomized testing that the SystemZ implementation of
shouldCoalesce() could be caused to crash when subreg liveness was
enabled. This was because an undef use of the virtual register was copied
outside current MBB at the point of shouldCoalesce() being called. For more
details, see https://bugs.llvm.org/show_bug.cgi?id=39276.

This patch changes the check for MBB locality from livein/liveout checks to
do checks for all instructions of both intervals being inside MBB. This
avoids the cases with dead defs / undef uses outside MBB, which are not
affecting liveness in/out of MBB.

The original test case included as a reduced .mir test case.

Review: Ulrich Weigand
https://reviews.llvm.org/D54197

llvm-svn: 346406
2018-11-08 15:29:48 +00:00
Roman Lebedev 2ad16b9371 [NFC][BdVer2] Tests for load and store throughput (PR39465)
During review it was noted that while it appears that
the Piledriver can do two [consecutive] loads per cycle,
it can only do one store per cycle. It was suggested
that the sched model incorrectly models that,
but it was opted to fix this afterwards.

These tests show that the two consecutive loads are
modelled correctly, and one consecutive stores is not
modelled incorrectly. Unless i'm missing the point.

https://bugs.llvm.org/show_bug.cgi?id=39465

llvm-svn: 346404
2018-11-08 14:48:56 +00:00
Simon Pilgrim b917740ac3 [X86][SSE] Add PR39387 shuffle test case
llvm-svn: 346402
2018-11-08 14:07:17 +00:00
Petr Pavlu 7c84b2e3ab [ARM] Enable spilling of the hGPR register class in Thumb2
Generalize code in Thumb2InstrInfo::storeRegToStackSlot() and
loadRegToStackSlot() to allow the GPR class or any of its sub-classes
(including hGPR) to be stored/loaded by ARM::t2STRi12/ARM::t2LDRi12.

Differential Revision: https://reviews.llvm.org/D51927

llvm-svn: 346401
2018-11-08 13:02:10 +00:00
Simon Pilgrim 1ef4af5278 [X86][AVX] Tidyup prefixes and regenerate interleaved tests
Share common AVX prefix and split off AVX2OR512 prefix instead

llvm-svn: 346399
2018-11-08 12:14:10 +00:00
Max Kazantsev 266c087b9d Return "[IndVars] Smart hard uses detection"
The patch has been reverted because it ended up prohibiting propagation
of a constant to exit value. For such values, we should skip all checks
related to hard uses because propagating a constant is always profitable.

Differential Revision: https://reviews.llvm.org/D53691

llvm-svn: 346397
2018-11-08 11:54:35 +00:00
Gil Rapaport 7b88bab386 [LSR] Combine unfolded offset into invariant register
LSR reassociates constants as unfolded offsets when the constants fit as
immediate add operands, which currently prevents such constants from being
combined later with loop invariant registers.
This patch modifies GenerateCombinations() to generate a second formula which
includes the unfolded offset in the combined loop-invariant register.

This commit fixes a bug in the original patch (committed at r345114, reverted
at r345123).

Differential Revision: https://reviews.llvm.org/D51861

llvm-svn: 346390
2018-11-08 09:01:19 +00:00
whitequark 73cb978495 [MergeFuncs] Improve ordering of equal functions
Summary:
MergeFunctions currently tries to process strong functions before
weak functions, because weak functions can simply call strong
functions, while a strong/weak function cannot call a weak function
(a backing strong function is needed).

This patch additionally tries to process external functions before
local functions, because we definitely have to keep the external
function, but may be able to drop the local one (and definitely
can if it is also unnamed_addr).

Unfortunately, this exposes an existing bug in the implementation:
The FnTree and FNodesInTree structures can currently go out of
sync in the case where two weak functions are merged, because the
function in FnTree/FNodesInTree is RAUWed. This leaves it behind in
FnTree (this is intended, as it is the strong backing function which
should be used for further merges), while it is replaced in
FNodesInTree (this is not intended).

This is fixed by switching FNodesInTree from using a ValueMap to
using a DenseMap of AssertingVH.

This exposes another minor issue: Currently FNodesInTree is not
cleared after MergeFunctions finishes running. Currently, this is
potentially dangerous (e.g. if something else wants to RAUW a function
with a non-function), but at the very least it is unnecessary/inefficient.
After the change to use AssertingVH it becomes more problematic,
because there are certainly passes that remove functions.

This issue is fixed by clearing FNodesInTree at the end of the pass.

Reviewers: jfb, whitequark

Reviewed By: whitequark

Subscribers: rkruppe, llvm-commits

Differential Revision: https://reviews.llvm.org/D53271

llvm-svn: 346386
2018-11-08 03:58:01 +00:00
whitequark 3580ac6125 [MergeFuncs] Call removeUsers() prior to unnamed_addr RAUW
Summary:
For unnamed_addr functions we RAUW instead of only replacing direct callers. However, functions in which replacements were performed currently are not added back to the worklist, resulting in missed merging opportunities.

Fix this by calling removeUsers() prior to RAUW.

Reviewers: jfb, whitequark

Reviewed By: whitequark

Subscribers: rkruppe, llvm-commits

Differential Revision: https://reviews.llvm.org/D53262

llvm-svn: 346385
2018-11-08 03:57:55 +00:00
Thomas Lively 897171902b [WebAssembly] Add V128 to WebAssemblyInstrInfo::copyPhysReg
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53872

llvm-svn: 346384
2018-11-08 02:35:28 +00:00
Reid Kleckner b41b372171 [sancov] Put .SCOV* sections into the right comdat groups on COFF
Avoids linker errors about relocations against discarded sections.

This was uncovered during the Chromium clang roll here:
https://chromium-review.googlesource.com/c/chromium/src/+/1321863#message-717516acfcf829176f6a2f50980f7a4bdd66469a

After this change, Chromium's libGLESv2 links successfully for me.

Reviewers: metzman, hans, morehouse

Differential Revision: https://reviews.llvm.org/D54232

llvm-svn: 346381
2018-11-08 00:57:33 +00:00
Stanislav Mekhanoshin 6cc8b2fc65 [AMDGPU] Extend promote alloca vectorization
Promote alloca can vectorize a small array by bitcasting it to a
vector type. Extend vectorization for the case when alloca is
already a vector type. We still want to replace GEPs with an
insert/extract element instructions in this case.

Differential Revision: https://reviews.llvm.org/D54219

llvm-svn: 346376
2018-11-08 00:16:23 +00:00
Anton Korobeynikov 09dff53840 [MSP430] Add MC layer
Summary:
This change implements assembler parser, code emitter, ELF object writer
and disassembler for the MSP430 ISA.  Also, more instruction forms are added
to the target description.

Reviewers: asl

Reviewed By: asl

Subscribers: pftbest, krisb, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D53661

llvm-svn: 346374
2018-11-08 00:03:45 +00:00
Jordan Rupprecht 4f36c7ad90 [llvm-readobj] Implement LLVM style printer for --notes
Summary:
Port the GNU style printNotes method to the LLVMStyle subclass.

This is basically just a heavy refactor so that the note parsing/formatting logic from the GNUStyle::printNotes can be shared with LLVMStyle::printNotes.

Reviewers: MaskRay

Reviewed By: MaskRay

Subscribers: dschuff, fedor.sergeev, llvm-commits

Differential Revision: https://reviews.llvm.org/D54220

llvm-svn: 346371
2018-11-07 23:53:50 +00:00
Rong Xu fb4bcc452c [PGO] Exit early if all count values are zero
If all the edge counts for a function are zero, skip count population and
annotation, as nothing will happen. This can save some compile time.

Differential Revision: https://reviews.llvm.org/D54212

llvm-svn: 346370
2018-11-07 23:51:20 +00:00
Daniel Sanders 162da63e05 Add 'REQUIRES: default_triple' to test/CodeGen/MIR/X86/zero-probability.mir
llvm-svn: 346368
2018-11-07 23:33:55 +00:00
Nicolai Haehnle bc233f5523 Revert "AMDGPU: Divergence-driven selection of scalar buffer load intrinsics"
This reverts commit r344696 for now (except for some test additions).

See https://bugs.freedesktop.org/show_bug.cgi?id=108611.

llvm-svn: 346364
2018-11-07 21:53:43 +00:00
Paul Robinson 746c22389c [DWARFv5] Read and dump multiple .debug_info sections.
Type units go in .debug_info comdats, not .debug_types, in v5.

Differential Revision: https://reviews.llvm.org/D53907

llvm-svn: 346360
2018-11-07 21:39:09 +00:00
Adrian Prantl 85e71733ed Fix spelling error
llvm-svn: 346359
2018-11-07 21:34:33 +00:00
Eli Friedman d00fb2e0a8 [AArch64] [Windows] Trap after noreturn calls.
Like the comment says, this isn't the most efficient fix in terms of
codesize, but it works.

Differential Revision: https://reviews.llvm.org/D54129

llvm-svn: 346358
2018-11-07 21:31:14 +00:00
Eli Friedman 7d7d41debc [ARM] Fix CPSR liveness in tMOVCCr_pseudo lowering.
The lowering was missing live-ins in certain cases, like a sequence of
multiple tMOVCCr_pseudo instructions.  This would lead to a verifier
failure, and on pre-v6 Thumb CPSR would be incorrectly clobbered.

For reasons I don't completely understand, it's hard to get a sequence
of multiple tMOVCCr_pseudo instructions; the issue only seems to show up
with 64-bit comparisons where the result is zero-extended. I added some
extra testcases in case that changes in the future. Probably some
optimization opportunities here if anyone is interested. (@test_slt_not
is the case that was getting miscompiled.)

The code to check the liveness of CPSR was stolen from
X86ISelLowering.cpp; maybe it could be refactored into common helper,
but I have no idea where to put it.

Differential Revision: https://reviews.llvm.org/D54192

llvm-svn: 346355
2018-11-07 21:08:13 +00:00
Matt Arsenault 8ba740a5a8 Allow subclassing ExternalAA
This allows testing AMDGPU alias analysis like any
other alias analysis pass. This fixes the existing
test pointlessly running opt -O3 when it really
just wants to run the one analysis.

Before there was no way to test this using -aa-eval
with opt, since the default constructed pass
is run. The wrapper subclass allows the
default constructor to pass the necessary callback.

llvm-svn: 346353
2018-11-07 20:26:42 +00:00
Fedor Sergeev f9a02a7006 [SimpleLoopUnswitch] partial unswitch needs to be careful when replacing invariants with constants
When partial unswitch operates on multiple conditions at once, .e.g:
   if (Cond1 || Cond2 || NonInv) ...

it should infer (and replace) values for individual conditions only on one
side of unswitch and not another.

More precisely only these derivations hold true:
   (Cond1 || Cond2) == false  =>  Cond1 == Cond2 == false
   (Cond1 && Cond2) == true   =>  Cond1 == Cond2 == true

By the way we organize unswitching it means only replacing on "continue" blocks
and never on "unswitched" ones. Since trivial unswitch does not have "unswitched"
blocks it does not have this problem.

Fixes PR 39568.

Reviewers: chandlerc, asbirlea
Differential Revision: https://reviews.llvm.org/D54211

llvm-svn: 346350
2018-11-07 20:05:11 +00:00
Mandeep Singh Grang d47d188b6f [LoopSink] Do not sink instructions into non-cold blocks
Summary: This fixes PR39570.

Reviewers: danielcdh, rnk, bkramer

Reviewed By: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54181

llvm-svn: 346337
2018-11-07 18:26:24 +00:00
Than McIntosh 5bcdea5118 [X86] improve split-stack machine BB placement
Summary:
The conditional branch created to support -fsplit-stack for X86 is
left unbiased/unhinted, resulting in less than ideal block placement:
the __morestack call block is kept on the main hot path. Bias the
branch to insure that the stack allocation block is treated as a
"cold" block during machine basic block placement.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54123

llvm-svn: 346336
2018-11-07 17:41:57 +00:00
Florian Hahn ac86038b40 [NewGVN] Make sure we do not add a user to itself.
If we simplify an instruction to itself, we do not need to add a user to
itself. For congruence classes with a defining expression, we already
use a similar logic.

Fixes PR38259.

Reviewers: davide, efriedma, mcrosier

Reviewed By: davide

Differential Revision: https://reviews.llvm.org/D51168

llvm-svn: 346335
2018-11-07 17:20:07 +00:00
James Y Knight 5fbc72f526 Workaround PPC backend bug in test for r346322.
It seems that the PPC backend croaks when lowering a call to a
function with an argument of type [2 x i32].

Just modify the type slightly to avoid this -- I wasn't actually
intending to stress test the backend...

llvm/lib/Target/PowerPC/PPCISelLowering.cpp:6172: llvm::SDValue llvm::PPCTargetLowering::LowerCall_64SVR4(...): Assertion `(!HasParameterArea || NumBytesActuallyUsed == ArgOffset) && "mismatch in size of parameter area"' failed.

llvm-svn: 346334
2018-11-07 17:01:47 +00:00
Sanjay Patel 57a08b3343 [InstCombine] propagate FMF for fcmp+fabs folds
By morphing the instruction rather than deleting and creating a new one,
we retain fast-math-flags and potentially other metadata (profile info?).

llvm-svn: 346331
2018-11-07 16:15:01 +00:00
Clement Courbet c544838f87 [llvm-exegesis] Correclty handle all X86 memory encoding formats.
Summary:
Add unit tests to check the support for each supported format to avoid
regressions such as the one in PR36906.

Reviewers: gchatelet

Subscribers: tschuett, lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D54144

llvm-svn: 346330
2018-11-07 16:14:55 +00:00
Sanjay Patel bb521e63af [InstCombine] peek through fabs() when checking isnan()
That should be the end of the missing cases for this fold.
See earlier patches in this series:
rL346321
rL346324

llvm-svn: 346327
2018-11-07 15:44:26 +00:00
Sanjay Patel d80ec9e11a [InstCombine] add tests for isnan(fabs(X)); NFC
llvm-svn: 346325
2018-11-07 15:36:23 +00:00
Sanjay Patel fa5f146872 [InstCombine] add folds for fcmp Pred fabs(X), 0.0
Similar to rL346321, we had folds for the ordered
versions of these compares already, so add the
unordered siblings for completeness.

llvm-svn: 346324
2018-11-07 15:33:03 +00:00
Sanjay Patel 16a527e7de [InstCombine] add tests for more fcmp+fabs preds; NFC
llvm-svn: 346323
2018-11-07 15:27:02 +00:00
James Y Knight 72f76bf230 Add support for llvm.is.constant intrinsic (PR4898)
This adds the llvm-side support for post-inlining evaluation of the
__builtin_constant_p GCC intrinsic.

Also fixed SCCPSolver::visitCallSite to not blow up when seeing a call
to a function where canConstantFoldTo returns true, and one of the
arguments is a struct.

Updated from patch initially by Janusz Sobczak.

Differential Revision: https://reviews.llvm.org/D4276

llvm-svn: 346322
2018-11-07 15:24:12 +00:00
Sanjay Patel 76faf5145d [InstCombine] add fold for fabs(X) u< 0.0
The sibling fold for 'oge' --> 'ord' was already here,
but this half was missing. 

The result of fabs() must be positive or nan, so asking 
if the result is negative or nan is the same as asking 
if the result is nan.

This is another step towards fixing:
https://bugs.llvm.org/show_bug.cgi?id=39475

llvm-svn: 346321
2018-11-07 15:11:32 +00:00
Sanjay Patel c006a0ad4b [InstCombine] add test for fcmp+fabs; NFC
llvm-svn: 346320
2018-11-07 15:01:09 +00:00
Sanjay Patel 46a2510d01 [InstCombine] add FMF to fcmp to show failure to propagate; NFC
llvm-svn: 346317
2018-11-07 14:44:09 +00:00
Sanjay Patel 7552d0d2e6 [InstCombine] do not shrink switch conditions to illegal types (PR29009)
This patch makes shrinking switch conditions less aggressive which was introduced by:
rL274233

Note that we have 2 new bugs to track potential follow-ups that might have solved PR29009
in different ways:
https://bugs.llvm.org/show_bug.cgi?id=39569
https://bugs.llvm.org/show_bug.cgi?id=39578

Patch by:
@dendibakh (Denis Bakhvalov)

Differential Revision: https://reviews.llvm.org/D54115

llvm-svn: 346315
2018-11-07 14:12:41 +00:00
Petar Avramovic 2624c8db68 [MIPS GlobalISel] Set operand order for G_MERGE and G_UNMERGE
Set operands order for G_MERGE_VALUES and G_UNMERGE_VALUES so
that least significant bits always go first, regardless of endianness.

Differential Revision: https://reviews.llvm.org/D54098

llvm-svn: 346305
2018-11-07 11:45:43 +00:00
Matthias Braun 5b7c90b4e2 RegAllocFast: Leave unassigned virtreg entries in map
Set `LiveReg::PhysReg` to zero when freeing a register instead of
removing it from the entry from `LiveRegMap`. This way no iterators get
invalidated and we can avoid passing around and updating iterators all
over the place.

This does not change any allocator decisions. It is not completely NFC
because the arbitrary iteration order through `LiveRegMap` in
`spillAll()` changes so we may get a different order in those spill
sequences (the amount of spills does not change).

This is in preparation of https://reviews.llvm.org/D52010.

llvm-svn: 346298
2018-11-07 06:57:03 +00:00
Max Kazantsev 68b2ad7e63 [NFC] Add missing test case, some test renaming
llvm-svn: 346295
2018-11-07 05:58:10 +00:00
Heejin Ahn ccf35b4a76 [WebAssembly] Update more test cases after FixFunctionBitcasts
These test updates were missing from rL346286.

llvm-svn: 346291
2018-11-07 02:26:03 +00:00
Heejin Ahn 756b50ea85 [WebAssembly] Update test cases after FixFunctionBitcasts
Summary:
This updates generated binaries and corresponding test cases up to date
after applying FixFunctionBitcasts pass.

Reviewers: sbc100

Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54070

llvm-svn: 346286
2018-11-07 01:58:50 +00:00
Joel E. Denny 19f0bc825a [FileCheck] Try to fix windows bots broken by r346272
llvm-svn: 346277
2018-11-06 22:42:10 +00:00
Joel E. Denny 24994d77b8 [FileCheck] Parse command-line options from FILECHECK_OPTS
This feature makes it easy to tune FileCheck diagnostic output when
running the test suite via ninja, a bot, or an IDE.  For example:

```
$ FILECHECK_OPTS='-color -v -dump-input-on-failure' \
  LIT_FILTER='OpenMP/for_codegen.cpp' ninja check-clang \
  | less -R
```

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D53517

llvm-svn: 346272
2018-11-06 22:07:03 +00:00
Yaxun Liu 73bf0af32f AMDGPU: Add an option -disable-promote-alloca-to-lds
Add this option for debugging and providing workaround.

By default it is off so no behavior change in backend.

Differential Revision: https://reviews.llvm.org/D54158

llvm-svn: 346267
2018-11-06 21:28:17 +00:00
Teresa Johnson cb397461e1 [ThinLTO] Split NotEligibleToImport into legality and inlinability flags
Summary:
The NotEligibleToImport flag on the GlobalValueSummary was set if it
isn't legal to import (e.g. because it references unpromotable locals)
and when it can't be inlined (in which case importing is pointless).

I split out the inlinable piece into a separate flag on the
FunctionSummary (doesn't make sense for aliases or global variables),
because in the future we may want to import for reasons other than
inlining.

Reviewers: davidxl

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D53345

llvm-svn: 346261
2018-11-06 19:41:35 +00:00
Craig Topper 6428a2cd9a [X86] Add custom promotion of v2i8/v2i16 fp_to_sint to avoid over promotion to v2i64 which would force scalarization.
llvm-svn: 346259
2018-11-06 19:24:21 +00:00
Vedant Kumar 1e209e284f [CodeExtractor] Do not extract calls to eh_typeid_for (PR39545)
The lowering for a call to eh_typeid_for changes when it's moved from
one function to another.

There are several proposals for fixing this issue in llvm.org/PR39545.
Until some solution is in place, do not allow CodeExtractor to extract
calls to eh_typeid_for, as that results in serious miscompilations.

llvm-svn: 346256
2018-11-06 19:06:08 +00:00
Vedant Kumar 09b7aa443d [CodeExtractor] Erase use-without-def debug intrinsics in parent func
When CodeExtractor moves instructions to a new function, debug
intrinsics referring to those instructions within the parent function
become invalid.

This results in the same verifier failure which motivated r344545, about
function-local metadata being used in the wrong function.

llvm-svn: 346255
2018-11-06 19:05:53 +00:00
Volkan Keles fa441730bb [AArch64][GlobalISel] Simplify and autogenerate the legalizer tests
llvm-svn: 346253
2018-11-06 18:59:18 +00:00
Volkan Keles ecacfe9c6c Reland r346166: [GlobalISel] Refactor the artifact combiner a bit by using MIPatternMatch
It was causing a crash because we were trying to get the definition
of a target register. Fixed the issue by adding a check and added
a test case for that.

llvm-svn: 346251
2018-11-06 18:31:25 +00:00
Eli Friedman e3a5fc6d80 Disable calls to *_finite and other glibc-only functions on Musl.
Non-GNU environments don't have __finite_*, so treat them as
unavailable.

Differential Revision: https://reviews.llvm.org/D51282

llvm-svn: 346250
2018-11-06 18:23:32 +00:00
Derek Schuff 6881806241 [WebAssembly] Add shared memory support to limits field
Support the IS_SHARED bit in the memory limits flag word.
The compiler does not create object files with memory definitions,
but the field is used by the linker.

Differential Revision: https://reviews.llvm.org/D54131

llvm-svn: 346246
2018-11-06 17:27:25 +00:00
Sanjay Patel 724014adde [InstCombine] allow vector types for fcmp+fpext fold
llvm-svn: 346245
2018-11-06 17:20:20 +00:00
Sanjay Patel db272b3720 [InstCombine] add vector test for fcmp+fpext; NFC
llvm-svn: 346243
2018-11-06 17:06:58 +00:00
Sanjay Patel 46bf3922c1 [InstCombine] propagate fast-math-flags when folding fcmp+fpext, part 2
llvm-svn: 346242
2018-11-06 16:45:27 +00:00
Sanjay Patel 1b85f00201 [InstCombine] propagate fast-math-flags when folding fcmp+fpext
llvm-svn: 346240
2018-11-06 16:23:03 +00:00
Sanjay Patel 6aea3071e8 [InstCombine] adjust tests to show dropping FMF; NFC
llvm-svn: 346239
2018-11-06 16:07:39 +00:00
Sanjay Patel 2fd5b0ebfb [InstCombine] propagate fast-math-flags when folding fcmp+fneg, part 2
llvm-svn: 346238
2018-11-06 15:58:57 +00:00
Sanjay Patel a166d19d93 [InstCombine] adjust tests to show dropping FMF; NFC
Also, remove some stale FIXME comments ( rL346234 ).

llvm-svn: 346236
2018-11-06 15:57:52 +00:00
Sanjay Patel 70282a0501 [InstCombine] propagate fast-math-flags when folding fcmp+fneg
This is another part of solving PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475

This might be enough to fix that particular issue, but as noted
with the FIXME, we're still dropping FMF on other folds around here.

llvm-svn: 346234
2018-11-06 15:49:45 +00:00
Sanjay Patel be985e33f0 [InstCombine] add tests for FMF propagation failure; NFC
llvm-svn: 346232
2018-11-06 15:21:44 +00:00
Simon Atanasyan bb36aea1d5 [mips] Support sigrie instruction
The `sigrie` instruction signals a Reserved Instruction Exception.
This patch adds support for assembling / disassembling the instruction.

Differential Revision: http://reviews.llvm.org/D53861

llvm-svn: 346230
2018-11-06 14:37:24 +00:00
Simon Pilgrim c1da5f757e [InstCombine] Ensure nested shifts are in range (OSS-Fuzz #9880)
llvm-svn: 346225
2018-11-06 11:28:22 +00:00
Dean Michael Berris 25f8d204b8 [XRay] Update XRayRecord to support Custom/Typed Events
Summary:
This change cuts across LLVM and compiler-rt to add support for
rendering custom events in the XRayRecord type, to allow for including
user-provided annotations in the output YAML (as raw bytes).

This work enables us to add custom event and typed event records into
the `llvm::xray::Trace` type for user-provided events. This can then be
programmatically handled through the C++ API and can be included in some
of the tooling as well. For now we support printing the raw data we
encounter in the custom events in the converted output.

Future work will allow us to start interpreting these custom and typed
events through a yet-to-be-defined API for extending the trace analysis
library.

Reviewers: mboerger

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D54139

llvm-svn: 346214
2018-11-06 08:51:37 +00:00
Max Kazantsev 69f6dfa0f8 [LICM] Use ICFLoopSafetyInfo in LICM
This patch makes LICM use `ICFLoopSafetyInfo` that is a smarter version
of LoopSafetyInfo that leverages power of Implicit Control Flow Tracking
to keep track of throwing instructions and give less pessimistic answers
to queries related to throws.

The ICFLoopSafetyInfo itself has been introduced in rL344601. This patch
enables it in LICM only.

Differential Revision: https://reviews.llvm.org/D50377
Reviewed By: apilipenko

llvm-svn: 346201
2018-11-06 02:44:49 +00:00
Max Kazantsev c210c65e77 [NFC] Add motivating test case for revert in rL346198
llvm-svn: 346199
2018-11-06 02:12:44 +00:00
Max Kazantsev e059f4452b Revert "[IndVars] Smart hard uses detection"
This reverts commit 2f425e9c7946b9d74e64ebbfa33c1caa36914402.

It seems that the check that we still should do the transform if we
know the result is constant is missing in this code. So the logic that
has been deleted by this change is still sometimes accidentally useful.
I revert the change to see what can be done about it. The motivating
case is the following:

@Y = global [400 x i16] zeroinitializer, align 1

define i16 @foo() {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %i = phi i16 [ 0, %entry ], [ %inc, %for.body ]

  %arrayidx = getelementptr inbounds [400 x i16], [400 x i16]* @Y, i16 0, i16 %i
  store i16 0, i16* %arrayidx, align 1
  %inc = add nuw nsw i16 %i, 1
  %cmp = icmp ult i16 %inc, 400
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %for.body
  %inc.lcssa = phi i16 [ %inc, %for.body ]
  ret i16 %inc.lcssa
}

We should be able to figure out that the result is constant, but the patch
breaks it.

Differential Revision: https://reviews.llvm.org/D51584

llvm-svn: 346198
2018-11-06 02:02:05 +00:00
Robert Widmann d36f3b0f92 [LLVM-C] Improve Intrinsics Bindings
Summary:
Improve the intrinsic bindings with operations for

- Retrieving and automatically inserting the declaration of an intrinsic by ID
- Retrieving the name of a non-overloaded intrinsic by ID
- Retrieving the name of an overloaded intrinsic by ID and overloaded parameter types

Improve the echo test to copy non-overloaded intrinsics by ID.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D53626

llvm-svn: 346195
2018-11-06 01:38:14 +00:00
Craig Topper 17057b52fe [X86] Autogenerate complete checks. NFC
llvm-svn: 346188
2018-11-06 00:31:27 +00:00
Sam Clegg 5292d17ec8 Revert "[WebAssembly] Fixup `main` signature by default"
This reverts rL345880.  It caused some test failures on the
webassembly waterfall.  e.g. binaryen2.test_mainenv fails due
the fact that `envp` ends up being undef rather than 0.

Differential Revision: https://reviews.llvm.org/D54117

llvm-svn: 346187
2018-11-06 00:31:02 +00:00
Justin Bogner d05345304c Specify REQUIRES: default_triple in two debuginfo tests
These were failing when specifying LLVM_DEFAULT_TARGET_TRIPLE=''

llvm-svn: 346185
2018-11-06 00:16:32 +00:00
Konstantin Zhuravlyov 108927b944 AMDGPU: Add sram-ecc feature
Differential Revision: https://reviews.llvm.org/D53222

llvm-svn: 346177
2018-11-05 22:44:19 +00:00
Sanjay Patel 1440107821 [InstSimplify] fold select (fcmp X, Y), X, Y
This is NFCI for InstCombine because it calls InstSimplify, 
so I left the tests for this transform there. As noted in
the code comment, we can allow this fold more often by using
FMF and/or value tracking.

llvm-svn: 346169
2018-11-05 21:51:39 +00:00
Sanjay Patel 72c2d355b7 [InstSimplify] add tests for select+fcmp; NFC
These are translated from InstCombine's test file with the same name.
We should move the transform from InstCombine to InstSimplify.

llvm-svn: 346168
2018-11-05 21:42:01 +00:00
Craig Topper ab896b08d4 [X86] Regenerate test checks in preparation for a patch. NFC
I'm preparing a patch to avoid creating critical edges in cmov expansion. Updating these tests to make the changes by the next patch easier to see.

llvm-svn: 346161
2018-11-05 19:45:37 +00:00
Taewook Oh 2b7ae47ccb [MergeICmps] Do not perform the transformation if GEP is used outside of block
Summary:
This patch prevents MergeICmps to performn the transformation if the address operand GEP of the load instruction has a use outside of the load's parent block. Without this patch, compiler crashes with the given test case because the use of `%first.i` is still around when the basic block is erased from https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/Scalar/MergeICmps.cpp#L620. I think checking `isUsedOutsideOfBlock` with `GEP` is the original intention of the code, as the checking for `LoadI` is already performed in the same function.

This patch is incomplete though, as this makes the pass overly conservative and fails the test `tuple-four-int8.ll`. I believe what needs to be done is checking if GEP has a use outside of block that is not the part of "Comparisons" chain. Submit the patch as of now to prevent compiler crash.

Reviewers: courbet, trentxintong

Reviewed By: courbet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54089

llvm-svn: 346151
2018-11-05 18:16:32 +00:00
Sanjay Patel 1cfba9b5ed [InstCombine] add/adjust tests for fcmp+select substitution; NFC
There was no coverage for at least 2 out of the 4 patterns because
of fcmp canonicalization. The tests and code should be moved to
InstSimplify in a follow-up because this doesn't create any new values.

llvm-svn: 346150
2018-11-05 18:09:10 +00:00
Zaara Syeda 7509880b54 [Power9] Add support for stxvw4x.be and stxvd2x.be intrinsics
On Power9, we don't have patterns to select the following intrinsics:
llvm.ppc.vsx.stxvw4x.be
llvm.ppc.vsx.stxvd2x.be

This patch adds support for these.

Differential Revision: https://reviews.llvm.org/D53581

llvm-svn: 346148
2018-11-05 17:31:26 +00:00
Sanjay Patel c26fd1e772 [InstCombine] canonicalize -0.0 to +0.0 in fcmp
As stated in IEEE-754 and discussed in:
https://bugs.llvm.org/show_bug.cgi?id=38086
...the sign of zero does not affect any FP compare predicate.

Known regressions were fixed with:
rL346097 (D54001)
rL346143

The transform will help reduce pattern-matching complexity to solve:
https://bugs.llvm.org/show_bug.cgi?id=39475
...as well as improve CSE and codegen (a zero constant is almost always
easier to produce than 0x80..00).

llvm-svn: 346147
2018-11-05 17:26:42 +00:00
Sanjay Patel 87aa10062c [InstCombine] loosen FP 0.0 constraint for fcmp+select substitution
It looks like we correctly removed edge cases with 0.0 from D50714,
but we were a bit conservative because getBinOpIdentity() doesn't
distinguish between +0.0 and -0.0 and 'nsz' is effectively always
true for fcmp (see discussion in:
https://bugs.llvm.org/show_bug.cgi?id=38086

Without this change, we would get regressions by canonicalizing
to +0.0 in all fcmp, and that's a step towards solving:
https://bugs.llvm.org/show_bug.cgi?id=39475

llvm-svn: 346143
2018-11-05 16:50:44 +00:00
Sanjay Patel 8b2a1f7fd9 [InstCombine] adjust tests for select with FP identity op; NFC
These are mislabeled as negative tests.

llvm-svn: 346142
2018-11-05 16:27:03 +00:00
Cameron McInally 9757d5d6c1 [FPEnv] Add constrained CEIL/FLOOR/ROUND/TRUNC intrinsics
Differential Revision: https://reviews.llvm.org/D53411

llvm-svn: 346141
2018-11-05 15:59:49 +00:00
Xin Tong 7ca744488f [ThinLTO] Add an option to disable (thin)lto internalization.
Summary:
LTO and ThinLTO optimizes the IR differently.

One source of differences is the amount of internalizations that
can happen.

Add an option to enable/disable internalization so that other
differences can be studied in isolation. e.g. inlining.

There are other things lto and thinlto do differently, I will add
flags to enable/disable them as needed.

Reviewers: tejohnson, pcc, steven_wu

Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, dang, llvm-commits

Differential Revision: https://reviews.llvm.org/D53294

llvm-svn: 346140
2018-11-05 15:49:46 +00:00
Sanjay Patel 92a53eabc6 [InstCombine] add/adjust tests for select with fsub identity op; NFC
llvm-svn: 346138
2018-11-05 15:45:01 +00:00
Cameron McInally 51a91e86e1 [NFCI][FPEnv] Split constrained intrinsic tests
The constrained intrinsic tests have grown in number. Split off
the FMA tests into their own file to reduce double coverage.

Differential Revision: https://reviews.llvm.org/D53932

llvm-svn: 346137
2018-11-05 15:28:10 +00:00
Sanjay Patel 278db2fba1 [InstCombine] add tests for select with FP identity op; NFC
llvm-svn: 346136
2018-11-05 15:08:36 +00:00
David Green ba9f245b0d [Inliner] Penalise inlining of calls with loops at Oz
We currently seem to underestimate the size of functions with loops in them,
both in terms of absolute code size and in the difficulties of dealing with
such code. (Calls, for example, can be tail merged to further reduce
codesize). At -Oz, we can then increase code size by inlining small loops
multiple times.

This attempts to penalise functions with loops at -Oz by adding a CallPenalty
for each top level loop in the function. It uses LI (and hence DT) to calculate
the number of loops. As we are dealing with minsize, the inline threshold is
small and functions at this point should be relatively small, making the
construction of these cheap.

Differential Revision: https://reviews.llvm.org/D52716

llvm-svn: 346134
2018-11-05 14:54:34 +00:00
Stefan Maksimovic 8d7c351799 [Mips] Supplement long branch pseudo instructions
Expand on LONG_BRANCH_LUi and LONG_BRANCH_(D)ADDiu pseudo
instructions by creating variants which support
less operands/accept GPR64Opnds as their operand in order
to appease the machine verifier pass.

Differential Revision: https://reviews.llvm.org/D53977

llvm-svn: 346133
2018-11-05 14:37:41 +00:00
Sam Parker 7275eec6e3 [NFC][ARM] Adding extra test for ARM CGP
Added a reproducer that I received a while ago.

llvm-svn: 346132
2018-11-05 14:17:27 +00:00
Neil Henning 233a02d0ed [AMDGPU] Fix the new atomic optimizer in pixel shaders.
The new atomic optimizer I previously added in D51969 did not work
correctly when a pixel shader was using derivatives, and had helper
lanes active.

To fix this we add an llvm.amdgcn.ps.live call that guards a branch
around the entire atomic operation - ensuring that all helper lanes are
inactive within the wavefront when we compute our atomic results.

I've added a test case that can cause derivatives, and exposes the
problem.

Differential Revision: https://reviews.llvm.org/D53930

llvm-svn: 346128
2018-11-05 12:04:48 +00:00
Sam Parker fec793c98f [ARM] Turn assert into condition in ARMCGP
Turn the assert in PrepareConstants into a conditon so that we can
handle mul instructions with negative immediates.

Differential Revision: https://reviews.llvm.org/D54094

llvm-svn: 346126
2018-11-05 11:26:04 +00:00
Sam Parker fcd8adab30 [ARM][ARMCGP] Remove unecessary zexts and truncs
r345840 slightly changed the way promotion happens which could
result in zext and truncs having the same source and destination
types. This fixes that issue.

We can now also remove the zext and trunc in the following case:
(zext (trunc (promoted op)), i32)

This means that we can no longer treat a value, that is only used by
a sink, to be safe to promote.

I've also added in some extra asserts and replaced a cast for a
dyn_cast.

Differential Revision: https://reviews.llvm.org/D54032

llvm-svn: 346125
2018-11-05 10:58:37 +00:00
Roman Lebedev 7db25f2b38 [NFC][x86][AArch64] extract-bits.ll: add test with 'ashr'.
llvm-svn: 346121
2018-11-05 09:20:08 +00:00
Dylan McKay 4c5a5c8db6 [AVR] Fix a backend bug that left extraneous operands after expansion
This patch fixes a bug in the AVR FRMIDX expansion logic.

The expansion would leave a leftover operand from the original FRMIDX,
but now attached to a MOVWRdRr instruction. The MOVWRdRr instruction
did not expect this operand and so LLVM rejected the machine
instruction.

This would trigger an assertion:

    Assertion failed: ((isImpReg || Op.isRegMask() || MCID->isVariadic() ||
                        OpNo < MCID->getNumOperands() || isMetaDataOp) &&
                        "Trying to add an operand to a machine instr that is already done!"),
    function addOperand, file llvm/lib/CodeGen/MachineInstr.cpp

Tim fixed this so that now the FRMIDX is expanded correctly into
a well-formed MOVWRdRr.

Patch by Tim Neumann

llvm-svn: 346117
2018-11-05 05:49:04 +00:00
Craig Topper 30b627e5c9 [X86] Custom type legalize v2i8/v2i16/v2i32 mul to use to pmuludq.
v2i8/v2i16/v2i32 are promoted to v2i64. pmuludq takes a v2i64 input and produces a v2i64 output. Since we don't about the upper bits of the type legalized multiply we can use the pmuludq to produce the multiply result for the bits we do care about.

llvm-svn: 346115
2018-11-05 05:02:12 +00:00
Dylan McKay 9a9ae99b30 [AVR] Disallow the LDDWRdPtrQ instruction with Z as the destination
This is an AVR-specific workaround for a limitation of the register
allocator that only exposes itself on targets with high register
contention like AVR, which only has three pointer registers.

The three pointer registers are X, Y, and Z.
In most nontrivial functions, Y is reserved for the frame pointer,
as per the calling convention. This leaves X and Z. Some instructions,
such as LPM ("load program memory"), are only defined for the Z
register. Sometimes this just leaves X.

When the backend generates a LDDWRdPtrQ instruction with Z as the
destination pointer, it usually trips up the register allocator
with this error message:

  LLVM ERROR: ran out of registers during register allocation

This patch is a hacky workaround. We ban the LDDWRdPtrQ instruction
from ever using the Z register as an operand. This gives the
register allocator a bit more space to allocate, fixing the
regalloc exhaustion error.

Here is a description from the patch author Peter Nimmervoll

  As far as I understand the problem occurs when LDDWRdPtrQ uses
  the ptrdispregs register class as target register. This should work, but
  the allocator can't deal with this for some reason. So from my testing,
  it seams like (and I might be totally wrong on this) the allocator reserves
  the Z register for the ICALL instruction and then the register class
  ptrdispregs only has 1 register left and we can't use Y for source and
  destination. Removing the Z register from DREGS fixes the problem but
  removing Y register does not.

More information about the bug can be found on the avr-rust issue
tracker at https://github.com/avr-rust/rust/issues/37.

A bug has raised to track the removal of this workaround and a proper
fix; PR39553 at https://bugs.llvm.org/show_bug.cgi?id=39553.

Patch by Peter Nimmervoll

llvm-svn: 346114
2018-11-05 05:00:44 +00:00
Craig Topper 60789b34e0 [X86] Fix typo in test comment. NFC
llvm-svn: 346110
2018-11-05 01:21:52 +00:00
Vedant Kumar d2a895a972 [HotColdSplitting] Use TTI to inform outlining threshold
Using TargetTransformInfo allows the splitting pass to factor in the
code size cost of instructions as it decides whether or not outlining is
profitable.

This did not regress the overall amount of outlining seen on the handful
of internal frameworks I tested.

Thanks to Jun Bum Lim for suggesting this!

Differential Revision: https://reviews.llvm.org/D53835

llvm-svn: 346108
2018-11-04 23:11:57 +00:00
Craig Topper 6d3c713689 [X86] Add nounwind to some tests to remove cfi directives from checks. NFC
llvm-svn: 346106
2018-11-04 21:37:45 +00:00
Craig Topper a3210b2713 [X86] Regenerate test checks to merge 32 and 64 bit. Remove stale check prefixes. NFC
llvm-svn: 346105
2018-11-04 21:37:43 +00:00
Craig Topper ed6a0a817f [X86] Add vector shift by immediate to SimplifyDemandedBitsForTargetNode.
Summary: This also enables some constant folding from KnownBits propagation. This helps on some cases vXi64 case in 32-bit mode where constant vectors appear as vXi32 and a bitcast. This can prevent getNode from constant folding sra/shl/srl.

Reviewers: RKSimon, spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54069

llvm-svn: 346102
2018-11-04 17:31:27 +00:00
Sanjay Patel e7c94ef1de [ValueTracking] determine sign of 0.0 from select when matching min/max FP
In PR39475:
https://bugs.llvm.org/show_bug.cgi?id=39475
..we may fail to recognize/simplify fabs() in some cases because we do not 
canonicalize fcmp with a -0.0 operand.

Adding that canonicalization can cause regressions on min/max FP tests, so 
that's this patch: for the purpose of determining whether something is min/max, 
let the value returned by the select determine how we treat a 0.0 operand in the fcmp.

This patch doesn't actually change the -0.0 to +0.0. It just changes the analysis, so 
we don't fail to recognize equivalent min/max patterns that only differ in the 
signbit of 0.0.

Differential Revision: https://reviews.llvm.org/D54001

llvm-svn: 346097
2018-11-04 14:28:48 +00:00
Sanjay Patel cac28b452e [ValueTracking] peek through 2-input shuffles in ComputeNumSignBits
This patch gives the IR ComputeNumSignBits the same functionality as the 
DAG version (the code is derived from the existing code).

This an extension of the single input shuffle analysis added with D53659.

Differential Revision: https://reviews.llvm.org/D53987

llvm-svn: 346071
2018-11-03 13:18:55 +00:00
Reid Kleckner 2bcb288ade [codeview] Let the X86 backend tell us the VFRAME offset adjustment
Use MachineFrameInfo's OffsetAdjustment field to pass this information
from the target to CodeViewDebug.cpp. The X86 backend doesn't use it for
any other purpose.

This fixes PR38857 in the case where there is a non-aligned quantity of
CSRs and a non-aligned quantity of locals.

llvm-svn: 346062
2018-11-03 00:41:52 +00:00
Wolfgang Pieb 5253cccbd5 [DWARF v5] Verifier: Add checks for DW_FORM_strx* forms.
Adding functionality to the DWARF verifier for DWARF v5 strx* forms which 
index into the string offsets table.

Differential Revision: https://reviews.llvm.org/D54049

llvm-svn: 346061
2018-11-03 00:27:35 +00:00
Teresa Johnson 7a92bc3e61 [LTO] Fix a crash caused by accessing an empty ValueInfo
ModuleSummaryIndex::exportToDot crashes when linking the Linux kernel
under ThinLTO using LLVMgold.so. This is due to the exportToDot
function trying to get the GUID of an empty ValueInfo. The root cause
related to the fact that we attempt to get the GUID of an aliasee
via its OriginalGUID recorded in the aliasee summary, and that is not
always possible. Specifically, we cannot do this mapping when the value
is internal linkage and there were other internal linkage symbols with
the same name.

There are 2 fixes for the problem included here.

1) In all cases where we can currently print the dot file from the
command line (which is only via save-temps), we have a valid AliaseeGUID
in the AliasSummary. Use that when it is available, so that we can get
the correct aliasee GUID whenever possible.

2) However, if we were to invoke exportToDot from the debugger right
after it is built during the initial analysis step (i.e. the per-module
summary), we won't have the AliaseeGUID field populated. In that case,
we have a fallback fix that will simply print "@"+GUID when we aren't
able to get the GUID from the OriginalGUID. It simply checks if the VI
is valid or not before attempting to get the name. Additionally, since
getAliaseeGUID will assert that the AliaseeGUID is non-zero, guard the
earlier fix #1 by a new function hasAliaseeGUID().

Reviewers: pcc, tmroeder

Subscribers: evgeny777, mehdi_amini, inglorion, dexonsmith, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D53986

llvm-svn: 346055
2018-11-02 23:49:21 +00:00
Craig Topper f7108aef14 [X86] In LowerEXTEND_VECTOR_INREG, emit a vector shuffle instead of directly using X86ISD::UNPCKL
The majority of the changes are because the rest of shuffle lowering/combining prefers to replace the undef input with the other operand. Using UNPCKL directly seemed to avoid this and just grabbed a randomish register for the undef which can create false dependencies.

llvm-svn: 346050
2018-11-02 22:48:02 +00:00
Wouter van Oortmerssen de28b5d17f [WebAssembly] Parsing missing directives to produce valid .o
Summary:
The assembler was able to assemble and then dump back to .s, but
was failing to parse certain directives necessary for valid .o
output:
- .type directives are now recognized to distinguish function symbols
  and others.
- .size is now parsed to provide function size.
- .globaltype (introduced in https://reviews.llvm.org/D54012) is now
  recognized to ensure symbols like __stack_pointer have a proper type
  set for both .s and .o output.

Also added tests for the above.

Reviewers: sbc100, dschuff

Subscribers: jgravelle-google, aheejin, dexonsmith, kristina, llvm-commits, sunfish

Differential Revision: https://reviews.llvm.org/D53842

llvm-svn: 346047
2018-11-02 22:04:33 +00:00
Craig Topper 60c202a494 [X86] Don't emit *_extend_vector_inreg nodes when both the input and output types are legal with AVX1
We already have custom lowering for the AVX case in LegalizeVectorOps. So its better to keep the regular extend op around as long as possible.

I had to qualify one place in DAG combine that created illegal vector extending load operations. This change by itself had no effect on any tests which is why its included here.

I've made a few cleanups to the custom lowering. The sign extend code no longer creates an identity shuffle with undef elements. The zero extend code now emits a zero_extend_vector_inreg instead of an unpckl with a zero vector.

For the high half of the custom lowering of zero_extend/any_extend, we're now using an unpckh with a zero vector or undef. Previously we used used a pshufd to move the upper 64-bits to the lower 64-bits and then used a zero_extend_vector_inreg. I think the zero vector should require less execution resources and be smaller code size.

Differential Revision: https://reviews.llvm.org/D54024

llvm-svn: 346043
2018-11-02 21:09:49 +00:00
Fangrui Song 999570a2f4 [DWARF] Fix typo, .gnu_index -> .gdb_index
llvm-svn: 346039
2018-11-02 20:34:40 +00:00
Eli Friedman d2941b43f4 [AArch64] [Windows] Misc fixes for llvm-readobj -unwind.
Use getImageBase() helper to compute the image base. Fix various
offsets/addresses/masks so they're actually correct.

This allows decoding unwind info from DLLs, and unwind info from object
files containing multiple functions.

Differential Revision: https://reviews.llvm.org/D54015

llvm-svn: 346036
2018-11-02 19:59:08 +00:00
Alex Bradbury 52c27785ce [RISCV] Add some missing expansions for floating-point intrinsics
A number of intrinsics, such as llvm.sin.f32, would result in a failure to 
select. This patch adds expansions for the relevant selection DAG nodes, as 
well as exhaustive testing for all f32 and f64 intrinsics.

The codegen for FMA remains a TODO item, pending support for the various 
RISC-V FMA instruction variants.

The llvm.minimum.f32.* and llvm.maximum.* tests are commented-out, pending 
upstream support for target-independent expansion, as discussed in 
http://lists.llvm.org/pipermail/llvm-dev/2018-November/127408.html.

Differential Revision: https://reviews.llvm.org/D54034
Patch by Luís Marques.

llvm-svn: 346034
2018-11-02 19:50:38 +00:00
Simon Pilgrim 88e8763bae [X86][AVX512] Change mask ops on vpermi2var tests to not use zeroinitializer.
This is necessary as I'm wanting to remove the 'Constant Pool' shuffle decoding from getTargetShuffleMask - but using getTargetShuffleMaskIndices allows the shuffle combiner to realize that these calls are really broadcasts.....

As with a lot of the X86ISD::VPERMV3 code this causes some vperm2i/vperm2t shuffles to flip depending on optimal commutation.

llvm-svn: 346032
2018-11-02 19:39:41 +00:00
Heejin Ahn 5b023e07ea [WebAssembly] Fix bugs in rethrow depth counting and InstPrinter
Summary:
EH stack depth is incremented at `try` and decremented at `catch`. When
there are more than two catch instructions for a try instruction, we
shouldn't count non-first catches when calculating EH stack depths.

This patch fixes two bugs:
- CFGStackify: Exclude `catch_all` in the terminate catch pad when
  calculating EH pad stack, because when we have multiple catches for a
  try we should count only the first catch instruction when calculating
  EH pad stack.
- InstPrinter: The initial intention was also to exclude non-first
  catches, but it didn't account nested try-catches, so it failed on
  this case:
```
try
  try
  catch
  end
catch    <-- (1)
end
```
In the example, when we are at the catch (1), the last seen EH
instruction is not `try` but `end_try`, violating the wrong assumption.

We don't need these after we switch to the second proposal because there
is gonna be only one `catch` instruction. But anyway before then these
bugfixes are necessary for keep trunk in working state.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53819

llvm-svn: 346029
2018-11-02 18:38:52 +00:00
Jordan Rupprecht 80e7e86c29 [DebugInfo][InstMerge] Fix -debugify for phi node created by -mldst-motion
Summary:
-mldst-motion creates a new phi node without any debug info. Use the merged debug location from the incoming stores to fix this.

Fixes PR38177. The test case here is (somewhat) simplified from:

```
struct S {
  int foo;
  void fn(int bar);
};
void S::fn(int bar) {
  if (bar)
    foo = 1;
  else
    foo = 0;
}
```

Reviewers: dblaikie, gbedwell, aprantl, vsk

Reviewed By: vsk

Subscribers: vsk, JDevlieghere, llvm-commits

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D54019

llvm-svn: 346027
2018-11-02 18:25:41 +00:00
Matthias Braun 5f7cb79e94 ARMExpandPseudoInsts: Fix CMP_SWAP expansion adding a kill flag to a def
llvm-svn: 346026
2018-11-02 18:22:15 +00:00
Leonard Mosescu 4bdbea3ce2 Fix a few small issues in llvm-pdbutil
Running "llvm-pdbutil dump -all" on linux (using the native PDB reader),
over a few PDBs pulled from the Microsoft public symbol store uncovered 
a few small issues:

- stripped PDBs might not have the strings stream (/names)
- stripped PDBs might not have the "module info" stream

Differential Revision: https://reviews.llvm.org/D54006

llvm-svn: 346010
2018-11-02 18:00:37 +00:00
Jonas Paulsson cced2a2775 [SystemZ::TTI] Improve cost handling of uint/sint to fp conversions.
Let i8/i16 uint/sint to fp conversions cost 1 if operand is a load.

Since the load already does the extension, there is no extra cost (previously
returned 2).

Review: Ulrich Weigand
https://reviews.llvm.org/D54028

llvm-svn: 346009
2018-11-02 17:53:31 +00:00
Easwaran Raman c5e1506ec8 [ProfileSummary] Add options to override hot and cold count thresholds.
Summary:
The hot and cold count thresholds are derived from the summary, but for
debugging purposes it is convenient to provide the actual thresholds.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54040

llvm-svn: 346005
2018-11-02 17:39:31 +00:00
Jonas Paulsson 79f2441eee [SystemZ] Rework getInterleavedMemoryOpCost()
Model this function more closely after the BasicTTIImpl version, with
separate handling of loads and stores. For loads, the set of actually loaded
vectors is checked.

This makes it more readable and just slightly more accurate generally.

Review: Ulrich Weigand
https://reviews.llvm.org/D53071

llvm-svn: 345998
2018-11-02 17:15:36 +00:00
Jeremy Morse d538352b3e [MachineSink][DebugInfo] Correctly sink DBG_VALUEs
As reported in PR38952, postra-machine-sink relies on DBG_VALUE insns being
adjacent to the def of the register that they reference. This is not always
true, leading to register copies being sunk but not the associated DBG_VALUEs,
which gives the debugger a bad variable location.

This patch collects DBG_VALUEs as we walk through a BB looking for copies to
sink, then passes them down to performSink. Compile-time impact should be
negligable.

Differential Revision: https://reviews.llvm.org/D53992

llvm-svn: 345996
2018-11-02 16:52:48 +00:00
Krzysztof Parzyszek f070544f8e [Hexagon] Do not reduce load size for globals in small-data
Small-data (i.e. GP-relative) loads and stores allow 16-bit scaled
offset. For a load of a value of type T, the small-data area is
equivalent to an array "T sdata[65536]". This implies that objects
of smaller sizes need to be closer to the beginning of sdata,
while larger objects may be farther away, or otherwise the offset
may be insufficient to reach it. Similarly, an object of a larger
size should not be accessed via a load of a smaller size.

llvm-svn: 345975
2018-11-02 14:17:47 +00:00
Alexey Bataev 8831ef7a16 [DEBUGINFO, NVPTX]DO not emit ',debug' option if no debug info or only debug directives are requested.
Summary:
If the output of debug directives only is requested, we should drop
emission of ',debug' option from the target directive. Required for
supporting of nvprof profiler.

Reviewers: probinson, echristo, dblaikie

Subscribers: Hahnfeld, jholewinski, llvm-commits, JDevlieghere, aprantl

Differential Revision: https://reviews.llvm.org/D46061

llvm-svn: 345972
2018-11-02 13:47:47 +00:00
Simon Pilgrim cdcbeb4997 [DAGCombiner] Remove reduceBuildVecConvertToConvertBuildVec and rely on the vectorizers instead (PR35732)
reduceBuildVecConvertToConvertBuildVec vectorizes int2float in the DAGCombiner, which means that even if the LV/SLP has decided to keep scalar code using the cost models, this will override this.

While there are cases where vectorization is necessary in the DAG (mainly due to legalization artefacts), I don't think this is the case here, we should assume that the vectorizers know what they are doing.

Differential Revision: https://reviews.llvm.org/D53712

llvm-svn: 345964
2018-11-02 11:06:18 +00:00
Ayal Zaks 45a3ca7be7 [LV] Avoid vectorizing loops under opt for size that involve SCEV checks
Fix PR39417, PR39497

The loop vectorizer may generate runtime SCEV checks for overflow and stride==1
cases, leading to execution of original scalar loop. The latter is forbidden
when optimizing for size. An assert introduced in r344743 triggered the above
PR's showing it does happen. This patch fixes this behavior by preventing
vectorization in such cases.

Differential Revision: https://reviews.llvm.org/D53612

llvm-svn: 345959
2018-11-02 09:16:12 +00:00
Dean Michael Berris 8a3ef6f3c3 [XRay] Fix tests with updated fdr-dump
Follow-up to D54022.

llvm-svn: 345955
2018-11-02 08:35:46 +00:00
Matt Arsenault 8e0269ba0b AMDGPU: Fix assertion with bitcast from i64 constant to v4i16
llvm-svn: 345922
2018-11-02 02:43:55 +00:00
Matthias Braun fdddd8e734 test/DebugInfo: Convert some tests to MIR
These tests are meant to test dwarf emission (or prolog/epilogue
generation) so we can convert them to .mir and only run the relevant
part of the pipeline.
This way they become independent of changes in earlier passes such as my
planned changes to RegAllocFast.

llvm-svn: 345919
2018-11-02 01:31:52 +00:00
Wouter van Oortmerssen 3231e518a3 [WebAssembly] Added a .globaltype directive to .s output.
Summary:
Assembly output can use globals like __stack_pointer implicitly,
but has no way of indicating the type of such a global, which makes
it hard for tools processing it (such as the MC Assembler) to
reconstruct this information.

The improved assembler directives parsing (in progress in
https://reviews.llvm.org/D53842) will make use of this information.

Also deleted code for the .import_global directive which was unused.

New test case in userstack.ll

Reviewers: dschuff, sbc100

Subscribers: jgravelle-google, aheejin, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54012

llvm-svn: 345917
2018-11-02 00:45:00 +00:00
Thomas Lively b2382c8bf7 [WebAssembly] General vector shift lowering
Summary: Adds support for lowering non-splat shifts.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53625

llvm-svn: 345916
2018-11-02 00:39:57 +00:00
Thomas Lively fb84fd7c8e [WebAssembly] Expand inserts and extracts with variable indices
Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53964

llvm-svn: 345913
2018-11-02 00:06:56 +00:00
Mandeep Singh Grang 547a0d765a [COFF, ARM64] Implement Intrinsic.sponentry for AArch64
Summary: This patch adds Intrinsic.sponentry. This intrinsic is required to correctly support setjmp for AArch64 Windows platform.

Patch by: Yin Ma (yinma@codeaurora.org)

Reviewers: mgrang, ssijaric, eli.friedman, TomTan, mstorsjo, rnk, compnerd, efriedma

Reviewed By: efriedma

Subscribers: efriedma, javed.absar, kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D53996

llvm-svn: 345909
2018-11-01 23:22:25 +00:00
Craig Topper e2483020f2 [DAGCombiner] Make the isTruncateOf call from visitZERO_EXTEND work for vectors. Remove FIXME.
I'm having trouble creating a test case for the ISD::TRUNCATE part of this that shows any codegen differences. But I was able to test the setcc path which is what the test changes here cover.

llvm-svn: 345908
2018-11-01 23:21:45 +00:00
Craig Topper 7a782cce35 [X86] Add test cases for adding vector support to isTruncateOf in DAGCombiner::visitZERO_EXTEND
llvm-svn: 345907
2018-11-01 23:21:42 +00:00
Jessica Paquette c991cf3687 [MachineOutliner][NFC] Remember when you map something illegal across MBBs
Instruction mapping in the outliner uses "illegal numbers" to signify that
something can't ever be part of an outlining candidate. This means that the
number is unique and can't be part of any repeated substring.

Because each of these is unique, we can use a single unique number to represent
a range of things we can't outline.

The outliner tries to leverage this using a flag which is set in an MBB when
the previous instruction we tried to map was "illegal". This patch improves
that logic to work across MBBs. As a bonus, this also simplifies the mapping
logic somewhat.

This also updates the machine-outliner-remarks test, which was impacted by the
order of Candidates on an OutlinedFunction changing. This order isn't
guaranteed, so I added a FIXME to fix that in a follow-up. The order of
Candidates on an OutlinedFunction isn't important, so this still is NFC.

llvm-svn: 345906
2018-11-01 23:09:06 +00:00
Farhana Aleen 5853762e5a [AMDGPU] Handle the idot8 pattern generated by FE.
Summary: Different variants of idot8 codegen dag patterns are not generated by llvm-tablegen due to a huge
         increase in the compile time. Support the pattern that clang FE generates after reordering the
         additions in integer-dot8 source language pattern.

Author: FarhanaAleen

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D53937

llvm-svn: 345902
2018-11-01 22:48:19 +00:00
Mandeep Singh Grang df19e57a1c [COFF, ARM64] Implement llvm.addressofreturnaddress intrinsic
Reviewers: rnk, mstorsjo, efriedma, TomTan

Reviewed By: efriedma

Subscribers: javed.absar, kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D53962

llvm-svn: 345892
2018-11-01 21:23:47 +00:00
Heejin Ahn 2e398976ba [WebAssembly] Fix signature parsing for 'try' in AsmParser
Summary:
Like `block` or `loop`, `try` can take an optional signature which can
be omitted. This patch allows `try`'s signature to be omitted. Also
added some tests for EH instructions.

Reviewers: aardappel

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D53873

llvm-svn: 345888
2018-11-01 20:32:15 +00:00
Sam Clegg ddf049869a [WebAssembly] Fixup `main` signature by default
Differential Revision: https://reviews.llvm.org/D53396

llvm-svn: 345880
2018-11-01 19:38:44 +00:00