Commit Graph

47767 Commits

Author SHA1 Message Date
Saleem Abdulrasool aff96d907b X86: treat SwiftCC as Win64_CC on Win64
The Swift CC is identical to Win64 CC with the exception of swift error
being passed in r12 which is a CSR.  However, since this calling
convention is only used in swift -> swift code, it does not impact
interoperability and can be treated entirely as Win64 CC.  We would
previously incorrectly lower the frame setup as we did not treat the
frame as conforming to Win64 specifications.

llvm-svn: 313813
2017-09-20 21:00:40 +00:00
Matt Arsenault 76935122cc AMDGPU: Start selecting v_mad_mixlo_f16
Also add some tests that should be able to use v_mad_mixhi_f16,
but do not yet. This is trickier because we don't really model
the partial update of the register done by 16-bit instructions.

llvm-svn: 313806
2017-09-20 20:28:39 +00:00
Matt Arsenault 644883ff07 AMDGPU: Fix encoding of op_sel for mad_mix* opcodes
llvm-svn: 313797
2017-09-20 19:09:28 +00:00
Sam Clegg d95ed959d8 Reland "[WebAssembly] Add support for naming wasm data segments"
Add adds support for naming data segments.  This is useful
useful linkers so that they can merge similar sections.

Differential Revision: https://reviews.llvm.org/D37886

llvm-svn: 313795
2017-09-20 19:03:35 +00:00
Saleem Abdulrasool 432b88e5f4 CodeGen: support SwiftError SwiftCC on Windows x64
Add support for passing SwiftError through a register on the Windows x64
calling convention.  This allows the use of swifterror attributes on
parameters which is used by the swift front end for the `Error`
parameter.  This partially enables building the swift standard library
for Windows x86_64.

llvm-svn: 313791
2017-09-20 18:40:59 +00:00
Marek Sokolowski c2189b8311 [llvm-readobj] Teach readobj to dump .res files (WindowsResource).
This enables readobj to output Windows resource files (.res). This way,
we'll be able to test .res outputs without comparing them byte-by-byte
with "magic binary files" generated by MS toolchain.

Differential Revision: https://reviews.llvm.org/D38058

llvm-svn: 313790
2017-09-20 18:33:35 +00:00
Reid Kleckner 4e04028791 Re-land "[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs"
After r313775, it's easier to maintain a parallel BitVector of spilled
locations indexed by location number.

I wasn't able to build a good reduced test case for this iteration of
the bug, but I added a more direct assertion that spilled values must
use frame index locations. If this bug reappears, it won't only fire on
the NEON vector code that we detected it on, but on medium-sized
integer-only programs as well.

llvm-svn: 313786
2017-09-20 18:19:08 +00:00
Hans Wennborg 57c3341ada Revert r313771 "[SLP] Vectorize jumbled memory loads."
This broke the buildbots, e.g.
http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/391

> Summary:
> This patch tries to vectorize loads of consecutive memory accesses, accessed
> in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
> which was reverted back due to some basic issue with representing the 'use mask'
> jumbled accesses.
>
> This patch fixes the mask representation by recording the 'use mask' in the usertree entry.
>
> Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df
>
> Subscribers: mzolotukhin
>
> Reviewed By: ayal
>
> Differential Revision: https://reviews.llvm.org/D36130
>
> Review comments updated accordingly
>
> Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0
>
> Added a TODO for sortLoadAccesses API
>
> Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58
>
> Modified the TODO for sortLoadAccesses API
>
> Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565
>
> Review comment update for using OpdNum to insert the mask in respective location
>
> Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce
>
> Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase
>
> Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b

llvm-svn: 313781
2017-09-20 18:00:03 +00:00
Adrian Prantl d3f9f2138d llvm-dwarfdump: implement --recurse-depth=<N>
This patch implements the Darwin dwarfdump option --recurse-depth=<N>,
which limits the recursion depth when selectively printing DIEs at an
offset.

Differential Revision: https://reviews.llvm.org/D38064

llvm-svn: 313778
2017-09-20 17:44:00 +00:00
Quentin Colombet aa103b3d86 [InstCombine] Add select simplifications
In these cases, two selects have constant selectable operands for
both the true and false components and have the same conditional
expression.
We then create two arithmetic operations of the same type and feed a
final select operation using the result of the true arithmetic for the true
operand and the result of the false arithmetic for the false operand and reuse
the original conditionl expression.
The arithmetic operations are naturally folded as a consequence, leaving
only the newly formed select to replace the old arithmetic operation.

Patch by: Michael Berg <michael_c_berg@apple.com>
Differential Revision: https://reviews.llvm.org/D37019

llvm-svn: 313774
2017-09-20 17:32:16 +00:00
Jake Ehrlich a45afd50d4 Reland "[llvm-objcopy] Add support for .dynamic, .dynsym, and .dynstr"
I did not upload two binaries that I reference in tests.

This change adds support for sections involved in dynamic loading such
as SHT_DYNAMIC, SHT_DYNSYM, and allocated string tables.

The two added binaries used for tests can be downloaded here and here

Differential Revision: https://reviews.llvm.org/D36560

llvm-svn: 313772
2017-09-20 17:22:06 +00:00
Mohammad Shahid 2b281de576 [SLP] Vectorize jumbled memory loads.
Summary:
This patch tries to vectorize loads of consecutive memory accesses, accessed
in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
which was reverted back due to some basic issue with representing the 'use mask'
jumbled accesses.

This patch fixes the mask representation by recording the 'use mask' in the usertree entry.

Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df

Subscribers: mzolotukhin

Reviewed By: ayal

Differential Revision: https://reviews.llvm.org/D36130

Review comments updated accordingly

Change-Id: I22ab0a8a9bac9d49d74baa81a08e1e486f5e75f0

Added a TODO for sortLoadAccesses API

Change-Id: I3c679bf1865422d1b45e17ea28f1992bca660b58

Modified the TODO for sortLoadAccesses API

Change-Id: Ie64a66cb5f9e2a7610438abb0e750c6e090f9565

Review comment update for using OpdNum to insert the mask in respective location

Change-Id: I016d0c1b29874e979efc0205bbf078991f92edce

Fixes '-Wsign-compare warning' in LoopAccessAnalysis.cpp and code rebase

Change-Id: I64b2ea5e68c1d7b6a028f5ef8251c5a97333f89b
llvm-svn: 313771
2017-09-20 17:19:57 +00:00
Jake Ehrlich e5d424b8dc Reland "[llvm-objcopy] Add support for .dynamic, .dynsym, and .dynstr"
I overzealously landed this before I was sure that another change
wouldn't break the build that this change depends on.

This change adds support for sections involved in dynamic loading such
as SHT_DYNAMIC, SHT_DYNSYM, and allocated string tables.

The two added binaries used for tests can be downloaded here and here

Differential Revision: https://reviews.llvm.org/D36560

llvm-svn: 313767
2017-09-20 17:11:58 +00:00
Teresa Johnson f625118ec7 [ThinLTO] Fix dead stripping analysis for SamplePGO
Summary:
The fix for dead stripping analysis in the case of SamplePGO indirect
calls to local functions (r313151) introduced the possibility of an
infinite loop.

Make sure we check for the value being already live after we update it
for SamplePGO indirect call handling.

Reviewers: danielcdh

Subscribers: mehdi_amini, inglorion, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D38086

llvm-svn: 313766
2017-09-20 17:09:47 +00:00
David Blaikie 1d5d44ff05 DebugInfo: Remove unneeded attributes from test/DebugInfo/Generic/imported-name-inlined.ll
Remove unneeded attributes from test/DebugInfo/Generic/imported-name-inlined.ll because it was causing failures on pure MIPS builds.

Patch by Miloš Stojanović!

Differential Revision: https://reviews.llvm.org/D38079

llvm-svn: 313762
2017-09-20 15:59:57 +00:00
Simon Atanasyan 8bdbb29524 [mips] Add a valid test case to check the reason of the recent build-bot failure. NFC
llvm-svn: 313761
2017-09-20 15:57:25 +00:00
Alexander Kornienko 6a140234ed Revert r313736: "[SLP] Vectorize jumbled memory loads."
The revision breaks buildbots:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/6694/steps/test/logs/stdio

llvm-svn: 313758
2017-09-20 14:53:07 +00:00
Simon Pilgrim d202ad15c1 [X86][SSE] Add PR22415 test case
llvm-svn: 313755
2017-09-20 13:49:52 +00:00
Florian Hahn ceb4494786 Recommit [MachineCombiner] Update instruction depths incrementally for large BBs.
This version of the patch fixes an off-by-one error causing PR34596. We
do not need to use std::next(BlockIter) when calling updateDepths, as
BlockIter already points to the next element.

Original commit message:
> For large basic blocks with lots of combinable instructions, the
> MachineTraceMetrics computations in MachineCombiner can dominate the compile
> time, as computing the trace information is quadratic in the number of
> instructions in a BB and it's relevant successors/predecessors.

> In most cases, knowing the instruction depth should be enough to make
> combination decisions. As we already iterate over all instructions in a basic
> block, the instruction depth can be computed incrementally. This reduces the
> cost of machine-combine drastically in cases where lots of instructions
> are combined. The major drawback is that AFAIK, computing the critical path
> length cannot be done incrementally. Therefore we only compute
> instruction depths incrementally, for basic blocks with more
> instructions than inc_threshold. The -machine-combiner-inc-threshold
> option can be used to set the threshold and allows for easier
> experimenting and checking if using incremental updates for all basic
> blocks has any impact on the performance.
>
> Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
>
> Reviewed By: fhahn
>
> Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 313751
2017-09-20 11:54:37 +00:00
George Rimar 0eb2f30b0b Revert r313746 "[yaml2obj] - Don't crash on invalid document."
It broke BB:
http://lab.llvm.org:8011/builders/llvm-hexagon-elf/builds/9781

llvm-svn: 313748
2017-09-20 10:24:37 +00:00
George Rimar cefe7e1142 [yaml2obj] - Don't crash on invalid document.
Previously jaml2obj would segfault on empty document.
(without yaml description).
Patch fixes the issue.

Differential revision: https://reviews.llvm.org/D38036

llvm-svn: 313746
2017-09-20 09:57:11 +00:00
Mikael Holmen 06064d1bac [IfConversion] Add testcases [NFC]
These tests should have been included in r310697 / D34099 but apparently
I missed them.

llvm-svn: 313737
2017-09-20 08:23:29 +00:00
Mohammad Shahid f8db9bd857 [SLP] Vectorize jumbled memory loads.
Summary:
This patch tries to vectorize loads of consecutive memory accesses, accessed
in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
which was reverted back due to some basic issue with representing the 'use mask' of
jumbled accesses.

This patch fixes the mask representation by recording the 'use mask' in the usertree entry.

Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df

Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh

Reviewed By: Ayal

Subscribers: mzolotukhin

Differential Revision: https://reviews.llvm.org/D36130

Commit after rebase for patch D36130

Change-Id: I8add1c265455669ef288d880f870a9522c8c08ab
llvm-svn: 313736
2017-09-20 08:18:28 +00:00
Andrew V. Tischenko 92980ce6aa 'into' instruction should not be decoded as a valid instr in 64-bit mode
llvm-svn: 313735
2017-09-20 08:17:17 +00:00
Matt Arsenault b81495dccb AMDGPU: Match load d16 hi instructions
Also starts selecting global loads for constant address
in some cases. Some end up selecting to mubuf still, which
requires investigation.

We still get sub-optimal regalloc and extra waitcnts inserted
due to not really tracking the liveness of the separate register
halves.

llvm-svn: 313716
2017-09-20 05:01:53 +00:00
Stanislav Mekhanoshin 5670e6d482 [AMDGPU] Port of HSAIL inliner
Differential Revision: https://reviews.llvm.org/D36849

llvm-svn: 313714
2017-09-20 04:25:58 +00:00
Matt Arsenault fcc213fab7 AMDGPU: Match store d16_hi instructions
llvm-svn: 313712
2017-09-20 03:20:09 +00:00
Sanjoy Das 09613b122e Tighten the invariants around LoopBase::invalidate
Summary:
With this change:
 - Methods in LoopBase trip an assert if the receiver has been invalidated
 - LoopBase::clear frees up the memory held the LoopBase instance

This change also shuffles things around as necessary to work with this stricter invariant.

Reviewers: chandlerc

Subscribers: mehdi_amini, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38055

llvm-svn: 313708
2017-09-20 02:31:57 +00:00
Mike Edwards b487bf45f0 Reverting due to Green Dragon bot failure.
http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/42594/

llvm-svn: 313706
2017-09-20 01:21:02 +00:00
Quentin Colombet d652aeb144 [MIRPrinter] Print empty successor lists when they cannot be guessed
This re-applies commit r313685, this time with the proper updates to
the test cases.

Original commit message:
Unreachable blocks in the machine instr representation are these
weird empty blocks with no successors.
The MIR printer used to not print empty lists of successors. However,
the MIR parser now treats non-printed list of successors as "please
guess it for me". As a result, the parser tries to guess the list of
successors and given the block is empty, just assumes it falls through
the next block (if any).

For instance, the following test case used to fail the verifier.
The MIR printer would print

         entry
        /      \
   true (def)   false (no list of successors)
       |
 split.true (use)

The MIR parser would understand this:

         entry
        /      \
   true (def)   false
       |        /  <-- invalid edge
 split.true (use)

Because of the invalid edge, we get the "def does not
dominate all uses" error.

The fix consists in printing empty successor lists, so that the parser
knows what to do for unreachable blocks.

rdar://problem/34022159

llvm-svn: 313696
2017-09-19 23:34:12 +00:00
Sam Clegg b292c25966 [WebAssembly] Add support for naming wasm data segments
Add adds support for naming data segments.  This is useful
useful linkers so that they can merge similar sections.

Differential Revision: https://reviews.llvm.org/D37886

llvm-svn: 313692
2017-09-19 23:00:57 +00:00
Quentin Colombet 6888dbcda7 Revert "[MIRPrinter] Print empty successor lists when they cannot be guessed"
This reverts commit r313685.

I thought I had ran ninja check, but apparently I didn't...
Need to update a bunch of mir tests.

llvm-svn: 313686
2017-09-19 22:03:50 +00:00
Quentin Colombet 7fdaa5e641 [MIRPrinter] Print empty successor lists when they cannot be guessed
Unreachable blocks in the machine instr representation are these
weird empty blocks with no successors.
The MIR printer used to not print empty lists of successors. However,
the MIR parser now treats non-printed list of successors as "please
guess it for me". As a result, the parser tries to guess the list of
successors and given the block is empty, just assumes it falls through
the next block (if any).

For instance, the following test case used to fail the verifier.
The MIR printer would print
          entry
         /      \
    true (def)   false (no list of successors)
        |
  split.true (use)

The MIR parser would understand this:
          entry
         /      \
    true (def)   false
        |        /  <-- invalid edge
  split.true (use)

Because of the invalid edge, we get the "def does not
dominate all uses" error.

The fix consists in printing empty successor lists, so that the parser
knows what to do for unreachable blocks.

rdar://problem/34022159

llvm-svn: 313685
2017-09-19 21:55:51 +00:00
Jake Ehrlich d246b0a284 Reland "[llvm-objcopy] Add support for nested and overlapping segments"
I didn't initialize a pointer to be nullptr that I needed to.

This change adds support for nested and even overlapping segments. This means
that PT_PHDR, PT_GNU_RELRO, PT_TLS, and PT_DYNAMIC can be supported properly.

Differential Revision: https://reviews.llvm.org/D36558

llvm-svn: 313682
2017-09-19 21:37:35 +00:00
Jonathan Roelofs 85908aa84b [ARM] Relax 'cpsie'/'cpsid' flag parsing.
The ARM docs suggest in examples that the flags can have either case, and there
are applications in the wild that (libopencm3, for example) that expect to be
able to use the uppercase spelling.

https://reviews.llvm.org/D37953

llvm-svn: 313680
2017-09-19 21:23:19 +00:00
Reid Kleckner ffdf087499 Revert "[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs"
This reverts r313640, originally r313400, one more time for essentially
the same issue. My BitVector of spilled location numbers isn't working
because we coalesce identical DBG_VALUE locations as we rewrite them,
invalidating the location numbers used to index the BitVector.

llvm-svn: 313679
2017-09-19 21:18:32 +00:00
Dehao Chen 62b9c33e1e Import all inlined indirect call targets for SamplePGO.
Summary: In the ThinLTO compilation, if a function is inlined in the profiling binary, we need to inline it before annotation. If the callee is not available in the primary module, a first step is needed to import that callee function. For the current implementation, if the call is an indirect call, which has been promoted to >1 targets and inlined, SamplePGO will only import one target with the largest sample count. This patch fixed the bug to import all targets instead.

Reviewers: tejohnson, davidxl

Reviewed By: tejohnson

Subscribers: sanjoy, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D36637

llvm-svn: 313678
2017-09-19 21:18:14 +00:00
Adrian Prantl ccc21c459b llvm-dwarfdump: un-hide more command line options
llvm-svn: 313673
2017-09-19 20:58:57 +00:00
Adrian Prantl b780d909fe Move test into non-target-specific directory.
llvm-svn: 313672
2017-09-19 20:58:56 +00:00
Stanislav Mekhanoshin d4ae470d2e [AMDGPU] Prevent post-RA scheduler from breaking memory clauses
The pre-RA scheduler does load/store clustering, but post-RA
scheduler undoes it. Add mutation to prevent it.

Differential Revision: https://reviews.llvm.org/D38014

llvm-svn: 313670
2017-09-19 20:54:38 +00:00
Ulrich Weigand 59a01a958a [SystemZ] Fix truncstore + bswap codegen bug
SystemZTargetLowering::combineSTORE contains code to transform a
combination of STORE + BSWAP into a STRV type instruction.

This transformation is correct for regular stores, but not for
truncating stores.  The routine neglected to check for that case.

Fixes a miscompilation of llvm-objcopy with clang, which caused
test suite failures in the SystemZ multistage build bot.

llvm-svn: 313669
2017-09-19 20:50:05 +00:00
Saleem Abdulrasool 5c37d57e15 Revert "ExecutionEngine: add R_AARCH64_ABS{16,32}"
This reverts commit SVN r313654.  Seems that it is triggering an
assertion on Windows specifically.  Revert until I can build on Windows
and look into what is happening there.

llvm-svn: 313668
2017-09-19 20:35:25 +00:00
Jake Ehrlich 317782122c Revert "[llvm-objcopy] Add support for .dynamic, .dynsym, and .dynstr"
This reverts commit r313663. Broken because overlapping-sections was
reverted.

llvm-svn: 313665
2017-09-19 20:00:04 +00:00
Jake Ehrlich 8f108248ba Revert "[llvm-objcopy] Add support for nested and overlapping segments"
This reverts commit r313656. Appears to be broken on Windows.

llvm-svn: 313664
2017-09-19 19:52:09 +00:00
Jake Ehrlich f20c3f4333 [llvm-objcopy] Add support for .dynamic, .dynsym, and .dynstr
This change adds support for sections involved in dynamic loading such
as SHT_DYNAMIC, SHT_DYNSYM, and allocated string tables.

The two added binaries used for tests can be downloaded [[
https://drive.google.com/file/d/0B3gtIAmiMwZXOXE3T0RobFg4ZTg/view?usp=sharing
| here ]] and [[
https://drive.google.com/file/d/0B3gtIAmiMwZXTFJSQUJZMGxNSXc/view?usp=sharing
| here ]]

Differential Revision: https://reviews.llvm.org/D36560

llvm-svn: 313663
2017-09-19 19:21:09 +00:00
David Blaikie 610e03a009 Fix test to not depend on another subdirectories Input directory
Inputs should be placed local to the test (or possibly in a common
parent? I think we do that in some places - but the only common parent
between these two directories is 'test' which seems a bit overly broad).

llvm-svn: 313662
2017-09-19 19:20:08 +00:00
Jake Ehrlich 523560ef57 [llvm-objcopy] Add test to check that architecture specific values are not used on wrong architecture.
This change adds a test that checks the an error is produced when a hexagon
specific reserved section index is used but e_machine is not EM_HEXAGON.

Differential Revision: https://reviews.llvm.org/D38017

llvm-svn: 313661
2017-09-19 19:05:15 +00:00
Dehao Chen b6e60c8b80 Handle profile mismatch correctly for SamplePGO.
Summary: Fix the bug when promoted call return type mismatches with the promoted function, we should not try to inline it. Otherwise it may lead to compiler crash.

Reviewers: davidxl, tejohnson, eraman

Reviewed By: tejohnson

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D38018

llvm-svn: 313658
2017-09-19 18:26:54 +00:00
Reid Kleckner 26fa1bf4da Re-land "Fix Bug 30978 by emitting cv file checksums."
This reverts r313431 and brings back r313374 with a fix to write
checksums as binary data and not ASCII hex strings.

llvm-svn: 313657
2017-09-19 18:14:45 +00:00
Jake Ehrlich 0a84b1ac80 [llvm-objcopy] Add support for nested and overlapping segments
This change adds support for nested and even overlapping segments. This means
that PT_PHDR, PT_GNU_RELRO, PT_TLS, and PT_DYNAMIC can be supported properly.

Differential Revision: https://reviews.llvm.org/D36558

llvm-svn: 313656
2017-09-19 18:14:03 +00:00
Saleem Abdulrasool 6567ecd741 ExecutionEngine: add R_AARCH64_ABS{16,32}
Add support for the R_AARCH64_ABS{16,32} relocations in the execution
engine.  This is primarily used for DWARF debug information relocations
and needed by the LLVM JIT to support JITing for lldb.

Patch by Alex Langford!

llvm-svn: 313654
2017-09-19 18:00:50 +00:00
Reid Kleckner 86c74dd791 Re-land r313400 "[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs"
I forgot to zero out the BitVector when reusing it between UserValues.

Later uses of the same location number for a different UserValue would
falsely indicate that they were spilled. Usually this would lead to
incorrect debug info, but in some cases they would indicate something
nonsensical like a memory location based on a vector register (Q8 on
ARM).

llvm-svn: 313640
2017-09-19 16:32:15 +00:00
Tony Jiang 2d9c5f3b8b [PowerPC Peephole] Constants into a join add, use ADDI over LI/ADD.
Two blocks prior to the join each perform an li and the the join block has an
add using the initialized register. Optimize each predecessor block to instead
use addi and delete the li's and add.

Differential Revision: https://reviews.llvm.org/D36734

llvm-svn: 313639
2017-09-19 16:14:37 +00:00
Evandro Menezes 0a98abc67c [AArch64] Extend tests of loads and stores of register pairs
Include instances of FP register pairs.

llvm-svn: 313638
2017-09-19 15:46:35 +00:00
Tony Jiang 425071eff3 [Power9] Add missing Power9 instructions.
The following 8 instructions are implemented in this patch.
addpcis(subpcis, lnia), darn, maddhd, maddhdu, maddld, setb

llvm-svn: 313636
2017-09-19 15:22:36 +00:00
Daniel Sanders 83e23d1398 [globalisel] Add a G_BSWAP instruction and support bswap using it.
llvm-svn: 313633
2017-09-19 14:25:15 +00:00
Simon Pilgrim d5e2878252 [X86][SSE] Add 'redundant pand' test case from PR34620
llvm-svn: 313632
2017-09-19 14:02:16 +00:00
Sanjay Patel bd7958d7ca [x86] regenerate checks; NFC
llvm-svn: 313631
2017-09-19 13:43:09 +00:00
Alexey Bataev 10ad32b76d [SLP] Reduce test, NFC.
llvm-svn: 313630
2017-09-19 13:38:56 +00:00
Daniel Sanders 000327742f [globalisel] Add support for intrinsic_void
llvm-svn: 313629
2017-09-19 13:23:01 +00:00
Daniel Sanders 28887fe548 [globalisel] Add support for intrinsic_w_chain.
This maps directly to G_INTRINSIC_W_SIDE_EFFECTS.

llvm-svn: 313627
2017-09-19 12:56:36 +00:00
Jina Nahias ccfb8d4fe8 [x86] Lowering Mask Set1 intrinsics to LLVM IR
This patch, together with a matching clang patch (https://reviews.llvm.org/D37668), implements the lowering of X86 mask set1 intrinsics to IR.

Differential Revision: https://reviews.llvm.org/D37669

llvm-svn: 313625
2017-09-19 11:03:06 +00:00
Roger Ferrer Ibanez 8d0180c955 [ARM] Use ADDCARRY / SUBCARRY
This is a preparatory step for D34515.

This change:
 - makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
 - lowering is done by first converting the boolean value into the carry flag
   using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
   using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
   operations does the actual addition.
 - for subtraction, given that ISD::SUBCARRY second result is actually a
   borrow, we need to invert the value of the second operand and result before
   and after using ARMISD::SUBE. We need to invert the carry result of
   ARMISD::SUBE to preserve the semantics.
 - given that the generic combiner may lower ISD::ADDCARRY and
   ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
   as well otherwise i64 operations now would require branches. This implies
   updating the corresponding test for unsigned.
 - add new combiner to remove the redundant conversions from/to carry flags
   to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
 - fixes PR34045
 - fixes PR34564

Differential Revision: https://reviews.llvm.org/D35192

llvm-svn: 313618
2017-09-19 09:05:39 +00:00
Andrei Elovikov 142516b456 Test commit.
llvm-svn: 313617
2017-09-19 07:56:20 +00:00
Matt Arsenault e745d9963e AMDGPU: Run internalize symbols at -O0
The relocations used for externally visible functions
aren't supported, so the direct call emitted ends
up hitting a linker error.

llvm-svn: 313616
2017-09-19 07:40:11 +00:00
Gadi Haber 6f8fbf4b86 [X86][Skylake] Adding the scheduling information for the SkylakeClient target
This patch adds the instruction scheduling information for the SkylakeClient (SKL) architecture target by adding the file X86SchedSkylakeClient.td located under the X86 Target.
We used the scheduling information retrieved from the Skylake architects in order to create the file.
The scheduling information includes latency, number of micro-Ops and used ports by each SKL instruction.
The patch continues the scheduling replacement and insertion effort started with the SNB target in r307529 and r310792 and for HSW in r311879.

Please expect some performance fluctuations due to code alignment effects.

Reviewers: craig.topper, zvi, chandlerc, igorb, aymanmus, RKSimon, delena
Differential Revision: https://reviews.llvm.org/D37294

llvm-svn: 313613
2017-09-19 06:19:27 +00:00
Craig Topper a80949feb5 [X86] Add VPERMPD/VPERMQ and VPERMPS/VPERMD to the execution domain fixing table.
llvm-svn: 313610
2017-09-19 04:39:55 +00:00
Vedant Kumar b7fdaf2cd4 [llvm-cov] Make report metrics agree with line exec counts, fixes PR34615
Use the same logic as the line-oriented coverage view to determine the
number of covered lines in a function.

Fixes llvm.org/PR34615.

llvm-svn: 313604
2017-09-19 02:00:12 +00:00
Vedant Kumar ad8f637bd8 [Coverage] Use gap regions to select better line exec counts
After clang started emitting deferred regions (r312818), llvm-cov has
had a hard time picking reasonable line execuction counts. There have
been one or two generic improvements in this area (e.g r310012), but
line counts can still report coverage for whitespace instead of code
(llvm.org/PR34612).

To fix the problem:

 * Introduce a new region kind so that frontends can explicitly label
   gap areas.

   This is done by changing the encoding of the columnEnd field of
   MappingRegion. This doesn't substantially increase binary size, and
   makes it easy to maintain backwards-compatibility.

 * Don't set the line count to a count from a gap area, unless the count
   comes from a wrapped segment.

 * Don't highlight gap areas as uncovered.

Fixes llvm.org/PR34612.

llvm-svn: 313597
2017-09-18 23:37:28 +00:00
Vedant Kumar 27d1b14ab0 [llvm-cov] Repair a test. NFC.
The checks with the MARKER prefix were not being run over the right
input, because stderr was not redirected properly.

llvm-svn: 313596
2017-09-18 23:37:27 +00:00
Yonghong Song 9ef85f0677 bpf: add inline-asm support
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 313593
2017-09-18 23:29:36 +00:00
Yi Kong bb4b4eef61 [ThinLTO/gold] Implement ThinLTO cache pruning support
Differential Revision: https://reviews.llvm.org/D37993

llvm-svn: 313592
2017-09-18 23:24:55 +00:00
Hans Wennborg 4de620ab3d Revert r313400 "[DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs"
This caused asserts in Chromium. See http://crbug.com/766261

> Summary:
> This comes up in optimized debug info for C++ programs that pass and
> return objects indirectly by address. In these programs,
> llvm.dbg.declare survives optimization, which causes us to emit indirect
> DBG_VALUE instructions. The fast register allocator knows to insert
> DW_OP_deref when spilling indirect DBG_VALUE instructions, but the
> LiveDebugVariables did not until this change.
>
> This fixes part of PR34513. I need to look into why this doesn't work at
> -O0 and I'll send follow up patches to handle that.
>
> Reviewers: aprantl, dblaikie, probinson
>
> Subscribers: qcolombet, hiraditya, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D37911

llvm-svn: 313589
2017-09-18 23:08:42 +00:00
Zachary Turner d4401d354a [lit] Update clang and lld to use new config helpers.
NFC intended here, this only updates clang and lld's lit configs
to use some helper functionality in the lit.llvm submodule.

llvm-svn: 313579
2017-09-18 22:26:48 +00:00
Sanjay Patel f31b1a00ea [DAGCombiner] fold assertzexts separated by trunc
If we have an AssertZext of a truncated value that has already been AssertZext'ed, 
we can assert on the wider source op to improve the zext-y knowledge:
 assert (trunc (assert X, i8) to iN), i1 --> trunc (assert X, i1) to iN

This moves a fold from being Mips-specific to general combining, and x86 shows
improvements.

Differential Revision: https://reviews.llvm.org/D37017

llvm-svn: 313577
2017-09-18 22:05:35 +00:00
Sanjay Patel 709a804a0a [InstCombine] auto-generate complete checks; NFC
The code responsible for these transforms has the potential to add 2 
instructions and break min/max patterns (PR33301).

llvm-svn: 313575
2017-09-18 21:57:56 +00:00
Adrian Prantl c2bc717028 llvm-dwarfdump: add a --show-parents options when selectively dumping DIEs.
llvm-svn: 313567
2017-09-18 21:27:44 +00:00
Adrian Prantl f077b5b885 Fix typo in testcase.
llvm-svn: 313566
2017-09-18 21:27:42 +00:00
Konstantin Zhuravlyov ca8946a376 AMDGPU: Start selecting s_xnor_{b32, b64}
Differential Revision: https://reviews.llvm.org/D37981

llvm-svn: 313565
2017-09-18 21:22:45 +00:00
Sanjay Patel 7765c93be2 [DAG, x86] allow store merging before and after legalization (PR34217)
rL310710 allowed store merging to occur after legalization to catch stores that are created late,
but this exposes a logic hole seen in PR34217:
https://bugs.llvm.org/show_bug.cgi?id=34217

We will miss merging stores if the target lowers vector extracts into target-specific operations.
This patch allows store merging to occur both before and after legalization if the target chooses
to get maximum merging.

I don't think the potential regressions in the other tests are relevant. The tests are for
correctness of weird IR constructs rather than perf tests, and I think those are still correct.

Differential Revision: https://reviews.llvm.org/D37987

llvm-svn: 313564
2017-09-18 20:54:26 +00:00
Craig Topper 39cdb84560 [X86] Make sure we still emit zext for GR32 to GR64 when the source of the zext is AssertZext
The AssertZext we might see in this case is only giving information about the lower 32 bits. It isn't providing information about the upper 32 bits. So we should emit a zext.

This fixes PR28540.

Differential Revision: https://reviews.llvm.org/D37729

llvm-svn: 313563
2017-09-18 20:49:13 +00:00
Alexey Bataev 286fe620ef [SLP] Add a test for PR34635, NFC.
llvm-svn: 313559
2017-09-18 19:33:30 +00:00
Sanjay Patel 74d12b5697 [x86] add tests for PR34217; NFC
llvm-svn: 313548
2017-09-18 18:07:50 +00:00
Simon Pilgrim 4aa28b9730 [X86][AVX] Improve (i8 bitcast (v8i1 x)) handling for 256-bit vector compare results.
As commented on D37849, AVX1 targets were missing a chance to use vmovmskps for v8f32/v8i32 results for bool vector bitcasts

llvm-svn: 313547
2017-09-18 17:58:31 +00:00
Sanjay Patel 078d5d978c [x86] regenerate checks; NFC
llvm-svn: 313545
2017-09-18 17:33:47 +00:00
Manoj Gupta 7476f629ed [LoopVectorizer] Add more testcases for PR33804.
Summary:
Add test cases when float <-> pointer types conversion is triggered
in presence of load instructions.

Reviewers: Ayal, srhines, mkuper, rengolin

Reviewed By: rengolin

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D37967

llvm-svn: 313544
2017-09-18 17:28:15 +00:00
Simon Pilgrim 0b21ef1fa3 [SelectionDAG] Add BITCAST handling to ComputeNumSignBits for splatted sign bits.
For cases where we are BITCASTing to vectors of smaller elements, then if the entire source was a splatted sign (src's NumSignBits == SrcBitWidth) we can say that the dst's NumSignBit == DstBitWidth, as we're just splitting those sign bits across multiple elements.

We could generalize this but at the moment the only use case I have is to peek through bitcasts to vector comparison results.

Differential Revision: https://reviews.llvm.org/D37849

llvm-svn: 313543
2017-09-18 16:45:05 +00:00
Craig Topper 77d7f331dd [X86] Fix two more places to prefer VPERMQ/PD over VPERM2X128 when AVX2 is enabled
The shuffle combining and lowerVectorShuffleAsLanePermuteAndBlend were both still trying to use VPERM2XF128 for unary shuffles when AVX2 is enabled. VPERM2X128 takes two inputs meaning when we use it for a unary shuffle one of those inputs is left undefined creating a false dependency on whatever register gets allocated there.

If we have VPERMQ/PD we should prefer those since they only have a single input.

Differential Revision: https://reviews.llvm.org/D37947

llvm-svn: 313542
2017-09-18 16:39:49 +00:00
Sam Parker 3fa0ccffc6 [AArch64] Add V8_2aOps feature to Cortex-A55 and 75
Add the missing hardware features the ProcA55 and ProcA75 feature.
These are already enabled via the target parser, but I had missed
them in the backend.

Differential Revision: https://reviews.llvm.org/D37974

llvm-svn: 313535
2017-09-18 14:46:14 +00:00
Sam Parker 71efbe4c68 [ARM] Implement isTruncateFree
Implement the isTruncateFree hooks, lifted from AArch64, that are
used by TargetTransformInfo. This allows simplifycfg to reduce the
test case into a single basic block.

Differential Revision: https://reviews.llvm.org/D37516

llvm-svn: 313533
2017-09-18 14:28:51 +00:00
Simon Pilgrim 00161c9961 [X86][SSE] Improve support for vselect(Cond, 0, X) -> ANDN(Cond, X)
As discussed on PR28925 and D37849.

Differential Revision: https://reviews.llvm.org/D37975

llvm-svn: 313532
2017-09-18 14:23:23 +00:00
Sjoerd Meijer 4e6df15962 [ARM] Fix for indexed dot product instruction descriptions
The indexed dot product instructions only accept the lower 16 D-registers as
the indexed register, but we were e.g. incorrectly accepting:

vudot.u8 d16,d16,d18[0]

Differential Revision: https://reviews.llvm.org/D37968

llvm-svn: 313531
2017-09-18 14:17:57 +00:00
Jonas Devlieghere c0a758d8ab [dwarfdump] Make .eh_frame an alias for .debug_frame
This patch makes the `.eh_frame` extension an alias for `.debug_frame`.
Up till now it was only possible to dump the section using objdump, but
not with dwarfdump. Since the two are essentially interchangeable, we
dump whichever of the two is present.

As a workaround, this patch also adds parsing for 3 currently
unimplemented CFA instructions: `DW_CFA_def_cfa_expression`,
`DW_CFA_expression`, and `DW_CFA_val_expression`. Because I lack the
required knowledge, I just parse the fields without actually creating
the instructions.

Finally, this also fixes the typo in the `.debug_frame` section name
which incorrectly contained a trailing `s`.

Differential revision: https://reviews.llvm.org/D37852

llvm-svn: 313530
2017-09-18 14:15:57 +00:00
Simon Pilgrim 360629d170 [X86][SSE] Add vselect with zero tests (PR28925)
llvm-svn: 313529
2017-09-18 13:32:33 +00:00
Nikolai Bozhenov 84af99b3b1 [X86FixupBWInsts] More precise register liveness if no <imp-use> on MOVs.
Summary:
Subregister liveness tracking is not implemented for X86 backend, so
sometimes the whole super register is said to be live, when only a
subregister is really live. That might happen if the def and the use
are located in different MBBs, see added fixup-bw-isnt.mir test.

However, using knowledge of the specific instructions handled by the
bw-fixup-pass we can get more precise liveness information which this
change does.

Reviewers: MatzeB, DavidKreitzer, ab, andrew.w.kaylor, craig.topper

Reviewed By: craig.topper

Subscribers: n.bozhenov, myatsina, llvm-commits, hiraditya

Patch by Andrei Elovikov <andrei.elovikov@intel.com>

Differential Revision: https://reviews.llvm.org/D37559

llvm-svn: 313524
2017-09-18 10:17:59 +00:00
Mohammed Agabaria 77cb080c2d [X86][Codegen] adding masked gathers tests for avx2
related to patch: https://reviews.llvm.org/D35772
adding llvm gathers test before gathers codegen support.

Differential Revision: https://reviews.llvm.org/D37800

llvm-svn: 313516
2017-09-18 06:49:54 +00:00
Dean Michael Berris 0f84a7d355 [XRay][tools] Support tail-call exits before we write them in the runtime
Summary:
This change adds support for explicit tail-exit records to be written by
the XRay runtime. This lets us differentiate the tail exit
records/events in the log, and allows us to treat those exit events
especially in the future. For now we allow printing those out in YAML
(and reading them in).

Reviewers: kpw, pelikan

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37964

llvm-svn: 313514
2017-09-18 06:08:46 +00:00
Craig Topper a6054328e8 [X86] Teach the execution domain fixing tables to use movlhps inplace of unpcklpd for the packed single domain.
MOVLHPS has a smaller encoding than UNPCKLPD in the legacy encodings. With VEX and EVEX encodings it doesn't matter.

llvm-svn: 313509
2017-09-18 04:40:58 +00:00
Craig Topper 87f7381edf [X86] Teach execution domain fixing to convert between FP and int unpack instructions.
llvm-svn: 313508
2017-09-18 03:29:54 +00:00
Craig Topper d4341920d5 [X86] Teach execution domain fixing to convert between VPERMILPS and VPSHUFD.
llvm-svn: 313507
2017-09-18 03:29:47 +00:00
Craig Topper ee6646d7de [X86] Teach shuffle lowering to use MOVLHPS/MOVHLPS for lowering v4f32 unary shuffles with SSE1 only.
llvm-svn: 313504
2017-09-17 22:36:41 +00:00
Craig Topper 6c221690a3 [X86] Add a couple more unary shuffles to the sse1 shuffle test.
These can be implemented with movlhps and movhlps.

llvm-svn: 313503
2017-09-17 22:36:39 +00:00
Jatin Bhateja 356e3e2c1d Adding test cases for PR34629 & PR34634.
Differential Revision: https://reviews.llvm.org/D37962

llvm-svn: 313490
2017-09-17 18:16:26 +00:00
Alex Bradbury 8ab4a9696a [RISCV] Add support for disassembly
This Disassembly support allows for 'round-trip' testing, and rv32i-valid.s
has been updated appropriately.

Differential Revision: https://reviews.llvm.org/D23567

llvm-svn: 313486
2017-09-17 14:36:28 +00:00
Alex Bradbury 6758ecb98c [RISCV] Add support for all RV32I instructions
This patch supports all RV32I instructions as described in the RISC-V manual.
A future patch will add support for pseudoinstructions and other instruction
expansions (e.g. 0-arg fence -> fence iorw, iorw).

Differential Revision: https://reviews.llvm.org/D23566

llvm-svn: 313485
2017-09-17 14:27:35 +00:00
Igor Breger f1d388a5c5 [GlobalISel][X86] Legalize i1 G_ADD/G_SUB/G_MUL/G_XOR/G_OR/G_AND instructions.
llvm-svn: 313483
2017-09-17 11:34:17 +00:00
Igor Breger 0f382ccb68 [GlobalISel][X86] Use correct physical register in mir tests.NFC.
llvm-svn: 313479
2017-09-17 08:30:42 +00:00
Igor Breger 21200ed7af [GlobalISel][X86] G_FCONSTANT support.
Summary: G_FCONSTANT support, port the implementation from X86FastIsel.

Reviewers: zvi, delena, guyblank

Reviewed By: delena

Subscribers: rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D37734

llvm-svn: 313478
2017-09-17 08:08:13 +00:00
Zachary Turner 6326d56195 [llvm-symbolizer] Fix coff-dwarf.test
This was a bug in the test that was only exposed as a result of
refactoring some code in lit configuration files.  Previously,
llvm's lit configuration would only set the target-windows feature
if the system was also windows.  Since cross-compilation is
a thing, this isn't correct.  target-windows should be set
independently of system-windows.

Adding to that bug, this particular test then checked for
target-windows when it really meant "can I call a certain API on
the host machine", which is what system-windows is for.

Ultimately, this test only works if *both* the target and host
are Windows, so I've updated the test to reflect that.

llvm-svn: 313468
2017-09-16 19:01:04 +00:00
Zachary Turner c3023d1bd1 Resubmit "Add a shared llvm.lit module that all test suites can use."
There were some issues surrounding Py2 / Py3 compatibility, but
I've now tested with both Py2 and Py3 and everything seems to
work.

llvm-svn: 313467
2017-09-16 18:46:21 +00:00
Adrian Prantl 597aa48d11 llvm-dwarfdump: support a --show-children option
This will print all children of a DIE when selectively printing only
one DIE at a given offset.

llvm-svn: 313464
2017-09-16 17:28:00 +00:00
Adrian Prantl 099d7e452a llvm-dwarfdump: Add support for -debug-types=<offset>.
llvm-svn: 313463
2017-09-16 16:58:18 +00:00
George Rimar 762abff698 [llvm-readobj] - Teach tool to report error if some section is in multiple COMDAT groups at once.
readelf tool reports an error when output contains the same section
in multiple COMDAT groups. That can be useful.
Path teaches llvm-readobj to do the same.

Differential revision: https://reviews.llvm.org/D37567

llvm-svn: 313459
2017-09-16 14:29:51 +00:00
Sanjay Patel 65d6780703 [x86] enable storeOfVectorConstantIsCheap() target hook
This allows vector-sized store merging of constants in DAGCombiner using the existing code in MergeConsecutiveStores(). 
All of the twisted logic that decides exactly what vector operations are legal and fast for each particular CPU are 
handled separately in there using the appropriate hooks.

For the motivating tests in merge-store-constants.ll, we already produce the same vector code in IR via the SLP vectorizer. 
So this is just providing a backend backstop for code that doesn't go through that pass (-O1). More details in PR24449:
https://bugs.llvm.org/show_bug.cgi?id=24449 (this change should be the last step to resolve that bug)

Differential Revision: https://reviews.llvm.org/D37451

llvm-svn: 313458
2017-09-16 13:29:12 +00:00
Craig Topper 23f78c1662 [X86] Add isel patterns to be able to fold loads into VPERM2F128 even when the load is on the first input to the SDNode.
We just need to toggle bits 1 and 5 of the immediate and swap the sources. The peephole pass could trigger commuting/folding for this later, but its easy enough to fix in isel.

Disable the peephole pass on the main vperm2x128 test so we know we're doing this through isel.

llvm-svn: 313455
2017-09-16 09:16:48 +00:00
Craig Topper 0d1b519f78 [X86] Remove unused check lines that got left behind when I moved tests to the instrinsic upgrade file and regenerated.
llvm-svn: 313454
2017-09-16 09:16:46 +00:00
Craig Topper 8374ffde08 [X86] Remove the vperm2f128 test file I just added in r313450.
I missed the we already had a pretty thorough test file for these instructions.

llvm-svn: 313451
2017-09-16 07:51:01 +00:00
Craig Topper f264fcc704 [X86] Remove VPERM2F128/VPERM2I128 intrinsics and autoupgrade to native shuffles.
I've moved the test cases from the InstCombine optimizations to the backend to keep the coverage we had there. It covered every possible immediate so I've preserved the resulting shuffle mask for each of those immediates.

llvm-svn: 313450
2017-09-16 07:36:14 +00:00
Craig Topper aa499c1cb2 [X86] Fix some FileCheck lines that use the wrong prefix.
Assume they were moved during autoupgrading and not changed.

llvm-svn: 313448
2017-09-16 07:13:39 +00:00
Craig Topper 950b19515a [X86] Don't set reserved bits in the immediate in the test cases for vperm2f128.
I'm going to autoupgrade these intrinsics in a future commit. This bit will never be set in the resulting output so pre-removing the bit.

llvm-svn: 313434
2017-09-16 02:11:21 +00:00
Craig Topper 9313df747d [X86] Remove slash in front of a CHECK line in a test.
llvm-svn: 313433
2017-09-16 01:43:21 +00:00
Eric Beckmann 913213c8ae Revert "Fix Bug 30978 by emitting cv file checksums."
This reverts commit 6389e7aa724ea7671d096f4770f016c3d86b0d54.

There is a bug in this implementation where the string value of the
checksum is outputted, instead of the actual hex bytes.  Therefore the
checksum is incorrect, and this prevent pdbs from being loaded by visual
studio.  Revert this until the checksum is emitted correctly.

llvm-svn: 313431
2017-09-16 01:14:36 +00:00
Zachary Turner 525b09d347 Revert lit changes related to lit.llvm module.
It looks like this is going to be non-trivial to get working
in both Py2 and Py3, so for now I'm reverting until I have time
to fully test it under Python 3.

llvm-svn: 313429
2017-09-16 00:52:49 +00:00
Zachary Turner 2aa5a92bdb Resubmit "[lit] Add a lit.llvm module that all llvm projects can use"
This was reverted alongside the revert of the lit/llvm-lit refactor,
but now that that has re-landed, I'm relanding this as well.

llvm-svn: 313426
2017-09-16 00:25:58 +00:00
Craig Topper d02179cd9c [X86] Remove usages of vperm2f intrinsics from fast isel tests to match what clang generates after r313418.
llvm-svn: 313424
2017-09-15 23:53:43 +00:00
Adrian Prantl 057d336c0d llvm-dwarfdump: Add support for -debug-info=<offset>.
This is the first of many commits that enable selectively dumping just
one record from the debug info.

This reapplies r313412 with some extra qualification to appease GCC and MSVC.

llvm-svn: 313419
2017-09-15 23:04:04 +00:00
Vedant Kumar 51d8f887db [llvm-cov] Avoid over-counting covered lines and regions
* Fix an unsigned integer overflow in the logic that computes the
  number of uncovered lines in a function.

* When aggregating region and line coverage summaries, take into account
  that different instantiations may have a different number of regions.

The new test case provides test coverage for both bugs. I also verified
this change by preparing a coverage report for a stage2 build of llc --
the new assertions should detect any outstanding over-counting bugs.

Fixes PR34613.

llvm-svn: 313417
2017-09-15 23:00:02 +00:00
Adrian Prantl b5abcc558d Revert "llvm-dwarfdump: Add support for -debug-info=<offset>."
This reverts commit r313412 because of a g++ incompatibility.

llvm-svn: 313413
2017-09-15 22:47:16 +00:00
Adrian Prantl fb5d284e97 llvm-dwarfdump: Add support for -debug-info=<offset>.
This is the first of many commits that enable selectively dumping just
one record from the debug info.

llvm-svn: 313412
2017-09-15 22:37:56 +00:00
Guozhi Wei 3d1305f6da [TargetTransformInfo] Static alloca has 0 cost
Static alloca usually doesn't generate any machine instructions, so it has 0 cost.

Differential Revision: https://reviews.llvm.org/D37879

llvm-svn: 313410
2017-09-15 22:28:12 +00:00
Chandler Carruth beb22b5437 [SLP] Revert r312791 and other necessary commits, except for TTI and
CostModel.

The original patch added support for horizontal min/max reductions to
the SLP vectorizer.

This patch causes LLVM to miscompile fairly simple signed min
reductions. I have attached a test progrom to http://llvm.org/PR34635
that shows the behavior change after this patch. We found this in a test
for the open source Eigen library, but also in other code.

Unfortunately, the revert is moderately challenging. It required
reverting:
r313042: [SLP] Test with multiple uses of conditional op and wrong parent.
r312853: [SLP] Fix buildbots, NFC.
r312793: [SLP] Fix the warning about paths not returning the value, NFC.
r312791: [SLP] Support for horizontal min/max reduction.

And even then, I had to completely skip reverting the changes to TTI and
CostModel because r312832 rewrote so much of this code. Plus, the cost
modeling changes aren implicated in the miscompile, so they should be
fine and will just not be used until this gets re-introduced.

llvm-svn: 313409
2017-09-15 22:23:27 +00:00
Zachary Turner ce92db13ea Resubmit "[lit] Force site configs to run before source-tree configs"
This is a resubmission of r313270.  It broke standalone builds of
compiler-rt because we were not correctly generating the llvm-lit
script in the standalone build directory.

The fixes incorporated here attempt to find llvm/utils/llvm-lit
from the source tree returned by llvm-config.  If present, it
will generate llvm-lit into the output directory.  Regardless,
the user can specify -DLLVM_EXTERNAL_LIT to point to a specific
lit.py on their file system.  This supports the use case of
someone installing lit via a package manager.  If it cannot find
a source tree, and -DLLVM_EXTERNAL_LIT is either unspecified or
invalid, then we print a warning that tests will not be able
to run.

Differential Revision: https://reviews.llvm.org/D37756

llvm-svn: 313407
2017-09-15 22:10:46 +00:00
Reid Kleckner 3a66c1cb58 [DebugInfo] Insert DW_OP_deref when spilling indirect DBG_VALUEs
Summary:
This comes up in optimized debug info for C++ programs that pass and
return objects indirectly by address. In these programs,
llvm.dbg.declare survives optimization, which causes us to emit indirect
DBG_VALUE instructions. The fast register allocator knows to insert
DW_OP_deref when spilling indirect DBG_VALUE instructions, but the
LiveDebugVariables did not until this change.

This fixes part of PR34513. I need to look into why this doesn't work at
-O0 and I'll send follow up patches to handle that.

Reviewers: aprantl, dblaikie, probinson

Subscribers: qcolombet, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37911

llvm-svn: 313400
2017-09-15 21:54:38 +00:00
Reid Kleckner 9e6c309ef3 [DebugInfo] Add missing DW_OP_deref when an NRVO pointer is spilled
Summary:
Fixes PR34513.

Indirect DBG_VALUEs typically come from dbg.declares of non-trivially
copyable C++ objects that must be passed by address. We were already
handling the case where the virtual register gets allocated to a
physical register and is later spilled. That's what usually happens for
normal parameters that aren't NRVO variables: they usually appear in
physical register parameters, and are spilled later in the function,
which would correctly add deref.

NRVO variables are different because the dbg.declare can come much later
after earlier instructions cause the incoming virtual register to be
spilled.

Also, clean up this code. We only need to look at the first operand of a
DBG_VALUE, which eliminates the operand loop.

Reviewers: aprantl, dblaikie, probinson

Subscribers: MatzeB, qcolombet, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37929

llvm-svn: 313399
2017-09-15 21:49:56 +00:00
Steven Wu ab211df5de [AutoUpgrade] Fix a compatibility issue with module flag
Summary:
After r304661, module flag to record objective-c image info section is
encoded without whitespaces after comma. The new name is equivalent to
the old one, except that when LTO a module built by old compiler and a
module built by a new compiler, it will fail with conflicting values.

Fix the issue by removing whitespaces in bitcode upgrade path.

rdar://problem/34416934

Reviewers: compnerd

Reviewed By: compnerd

Subscribers: mehdi_amini, hans, llvm-commits

Differential Revision: https://reviews.llvm.org/D37909

llvm-svn: 313398
2017-09-15 21:12:14 +00:00
Sam Clegg 759631c77b [WebAssembly] MC: Create wasm data segments based on MCSections
This means that we can honor -fdata-sections rather than
always creating a segment for each symbol.

It also allows for a followup change to add .init_array and friends.

Differential Revision: https://reviews.llvm.org/D37876

llvm-svn: 313395
2017-09-15 20:54:59 +00:00
Davide Italiano dee018c51f [ConstantFold] Return the correct type when folding a GEP with vector indices.
As Eli pointed out (and I got wrong in the first place), langref says: "The
getelementptr returns a vector of pointers, instead of a single address, when one
or more of its arguments is a vector. In such cases, all vector arguments should
have the same number of elements, and every scalar argument will be effectively
broadcast into a vector during address calculation."

Costantfold for gep doesn't really take in account this paragraph, returning a
pointer instead of a vector of pointer which triggers an assertion in RAUW,
as we're trying to replace values with mistmatching types.

Differential Revision:  https://reviews.llvm.org/D37928

llvm-svn: 313394
2017-09-15 20:53:05 +00:00
Vivek Pandya b5ab895e2a This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352
It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but when specific remarks are requested with
command line options.
The diagnostic handler used to be callback now this patch adds a class
DiagnosticHandler. It has virtual method to provide custom diagnostic handler
and methods to control which particular remarks are enabled. 
However LLVM-C API users can still provide callback function for diagnostic handler.

llvm-svn: 313390
2017-09-15 20:10:09 +00:00
Vivek Pandya df8598dcc4 This reverts r313381
llvm-svn: 313387
2017-09-15 19:53:54 +00:00
Vivek Pandya 00d887447b This patch fixes https://bugs.llvm.org/show_bug.cgi?id=32352
It enables OptimizationRemarkEmitter::allowExtraAnalysis and MachineOptimizationRemarkEmitter::allowExtraAnalysis to return true not only for -fsave-optimization-record but when specific remarks are requested with
command line options.
The diagnostic handler used to be callback now this patch adds a class
DiagnosticHandler. It has virtual method to provide custom diagnostic handler
and methods to control which particular remarks are enabled. 
However LLVM-C API users can still provide callback function for diagnostic handler.

llvm-svn: 313382
2017-09-15 19:30:59 +00:00
Sam Clegg aff1c4df25 [WebAssembly] MC: Fix crash in getProvitionalValue on weak references
- Create helper function for resolving weak references.
- Add test that preproduces the crash.

Differential Revision: https://reviews.llvm.org/D37916

llvm-svn: 313381
2017-09-15 19:22:01 +00:00
Hans Wennborg 534bfbd3ba Revert r313343 "[X86] PR32755 : Improvement in CodeGen instruction selection for LEAs."
This caused PR34629: asserts firing when building Chromium. It also broke some
buildbots building test-suite as reported on the commit thread.

> Summary:
>    1/  Operand folding during complex pattern matching for LEAs has been
>        extended, such that it promotes Scale to accommodate similar operand
>        appearing in the DAG.
>        e.g.
>           T1 = A + B
>           T2 = T1 + 10
>           T3 = T2 + A
>        For above DAG rooted at T3, X86AddressMode will no look like
>           Base = B , Index = A , Scale = 2 , Disp = 10
>
>    2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
>        so that if there is an opportunity then complex LEAs (having 3 operands)
>        could be factored out.
>        e.g.
>           leal 1(%rax,%rcx,1), %rdx
>           leal 1(%rax,%rcx,2), %rcx
>        will be factored as following
>           leal 1(%rax,%rcx,1), %rdx
>           leal (%rdx,%rcx)   , %edx
>
>    3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
>       thus avoiding creation of any complex LEAs within a loop.
>
> Reviewers: lsaba, RKSimon, craig.topper, qcolombet
>
> Reviewed By: lsaba
>
> Subscribers: spatel, igorb, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 313376
2017-09-15 18:40:26 +00:00
Eric Beckmann 349746f044 Fix Bug 30978 by emitting cv file checksums.
Summary:
The checksums had already been placed in the IR, this patch allows
MCCodeView to actually write it out to an MCStreamer.

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37157

llvm-svn: 313374
2017-09-15 18:20:28 +00:00
Craig Topper 7a183e2760 [X86] Prefer VPERMQ over VPERM2F128 for any unary shuffle, not just the ones that can be done with a insertf128
The early out for AVX2 in lowerV2X128VectorShuffle is positioned in a weird spot below some shuffle mask equivalency checks.

But I think we want to allow VPERMQ for any unary shuffle.

Differential Revision: https://reviews.llvm.org/D37893

llvm-svn: 313373
2017-09-15 18:11:13 +00:00
Craig Topper e0d724cf51 [X86] Don't create i64 constants on 32-bit targets when lowering v64i1 constant build vectors
When handling a v64i1 build vector of constants on 32-bit targets we were creating an illegal i64 constant that we then bitcasted back to v64i1. We need to instead create two 32-bit constants, bitcast them to v32i1 and concat the result. We should also take care to handle the halves being all zeros/ones after the split.

This patch splits the build vector and then recursively lowers the two pieces. This allows us to handle the all ones and all zeros cases with minimal effort. Ideally we'd just do the split and concat, and let lowering get called again on the new nodes, but getNode has special handling for CONCAT_VECTORS that reassembles the pieces back into a single BUILD_VECTOR. Hopefully the two temporary BUILD_VECTORS we had to create to do this that don't get returned don't cause any issues.

Fixes PR34605.

Differential Revision: https://reviews.llvm.org/D37858

llvm-svn: 313366
2017-09-15 17:09:03 +00:00
Craig Topper 143797eb89 [X86] Add isel pattern infrastructure to begin recognizing when we're inserting 0s into the upper portions of a vector register and the producing instruction as already produced the zeros.
Currently if we're inserting 0s into the upper elements of a vector register we insert an explicit move of the smaller register to implicitly zero the upper bits. But if we can prove that they are already zero we can skip that. This is based on a similar idea of what we do to avoid emitting explicit zero extends for GR32->GR64.

Unfortunately, this is harder for vector registers because there are several opcodes that don't have VEX equivalent instructions, but can write to XMM registers. Among these are SHA instructions and a MMX->XMM move. Bitcasts can also get in the way.

So for now I'm starting with explicitly allowing only VPMADDWD because we emit zeros in combineLoopMAddPattern. So that is placing extra instruction into the reduction loop.

I'd like to allow PSADBW as well after D37453, but that's currently blocked by a bitcast. We either need to peek through bitcasts or canonicalize insert_subvectors with zeros to remove bitcasts on the value being inserted.

Longer term we should probably have a cleanup pass that removes superfluous zeroing moves even when the producer is in another basic block which is something these isel tricks can't do. See PR32544.

Differential Revision: https://reviews.llvm.org/D37653

llvm-svn: 313365
2017-09-15 17:09:00 +00:00
Anna Thomas f34537dff8 [RuntimeUnroll] Add heuristic for unrolling multi-exit loop
Add a profitability heuristic to enable runtime unrolling of multi-exit
loop: There can be atmost two unique exit blocks for the loop and the
second exit block should be a deoptimizing block. Also, there can be one
other exiting block other than the latch exiting block. The reason for
the latter is so that we limit the number of branches in the unrolled
code to being at most the unroll factor.  Deoptimizing blocks are rarely
taken so these additional number of branches created due to the
unrolling are predictable, since one of their target is the deopt block.

Reviewers: apilipenko, reames, evstupac, mkuper

Subscribers: llvm-commits

Reviewed by: reames

Differential Revision: https://reviews.llvm.org/D35380

llvm-svn: 313363
2017-09-15 15:56:05 +00:00
Anna Thomas 512dde77ba [RuntimeUnrolling] Populate the VMap entry correctly when default generated through lookup
During runtime unrolling on loops with multiple exits, we update the
exit blocks with the correct phi values from both original and remainder
loop.
In this process, we lookup the VMap for the mapped incoming phi values,
but did not update the VMap if a default entry was generated in the VMap
during the lookup. This default value is generated when constants or
values outside the current loop are looked up.
This patch fixes the assertion failure when null entries are present in
the VMap because of this lookup. Added a testcase that showcases the
problem.

llvm-svn: 313358
2017-09-15 13:29:33 +00:00
Simon Pilgrim 905e79c4dc [X86][SSE] Add test cases vector for integer multiplies
Mainly inspired by PR34474 / D37896

llvm-svn: 313353
2017-09-15 11:17:42 +00:00
Ilya Biryukov d23faa843e Revert "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops."
This reverts commit r313348.

Reason: it caused buildbot failures.
llvm-svn: 313352
2017-09-15 10:15:00 +00:00
Sjoerd Meijer 0c5ba21cbf [AArch64] allow v8f16 types when FullFP16 is supported
This adds support for allowing v8f16 vector types, thus avoiding conversions
from/to single precision for these types. This is a follow up patch of
commits r311154 and r312104, which added support for scalars and v4f16
types, respectively.

Differential Revision: https://reviews.llvm.org/D37802

llvm-svn: 313351
2017-09-15 09:24:48 +00:00
Dinar Temirbulatov e2358b53bc [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
Patch tries to improve vectorization of the following code:

void add1(int * __restrict dst, const int * __restrict src) {
  *dst++ = *src++;
  *dst++ = *src++ + 1;
  *dst++ = *src++ + 2;
  *dst++ = *src++ + 3;
}
Allows to vectorize even if the very first operation is not a binary add, but just a load.

Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev, davide

Subscribers: llvm-commits, RKSimon

Differential Revision: https://reviews.llvm.org/D28907

llvm-svn: 313348
2017-09-15 06:56:39 +00:00
Jatin Bhateja 908c8b37c2 [X86] PR32755 : Improvement in CodeGen instruction selection for LEAs.
Summary:
   1/  Operand folding during complex pattern matching for LEAs has been
       extended, such that it promotes Scale to accommodate similar operand
       appearing in the DAG.
       e.g.
          T1 = A + B
          T2 = T1 + 10
          T3 = T2 + A
       For above DAG rooted at T3, X86AddressMode will no look like
          Base = B , Index = A , Scale = 2 , Disp = 10

   2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
       so that if there is an opportunity then complex LEAs (having 3 operands)
       could be factored out.
       e.g.
          leal 1(%rax,%rcx,1), %rdx
          leal 1(%rax,%rcx,2), %rcx
       will be factored as following
          leal 1(%rax,%rcx,1), %rdx
          leal (%rdx,%rcx)   , %edx

   3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
      thus avoiding creation of any complex LEAs within a loop.

Reviewers: lsaba, RKSimon, craig.topper, qcolombet

Reviewed By: lsaba

Subscribers: spatel, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35014

llvm-svn: 313343
2017-09-15 05:29:51 +00:00
Zachary Turner 83dcb68468 Revert "[lit] Force site configs to run before source-tree configs"
This patch is still breaking several multi-stage compiler-rt bots.
I already know what the fix is, but I want to get the bots green
for now and then try re-applying in the morning.

llvm-svn: 313335
2017-09-15 02:56:40 +00:00
Reid Kleckner 87288b98b6 [codeview] Use a type index of zero for static method "this" types
Otherwise VS won't show anything in the autos or watch window of static
methods.

llvm-svn: 313329
2017-09-15 00:59:07 +00:00
Zachary Turner 7f9c0ce1bb [lit] Revert "Add a lit.llvm module that all llvm projects can use"
This is breaking due to some changes I forgot to merge in, so I'm
temporarily reverting them until I can re-test that this works.

llvm-svn: 313328
2017-09-15 00:56:08 +00:00
Zachary Turner e5412b882a [lit] Add a lit.llvm module that all test suites can use.
To further reduce duplicate code, this patch introduces a module
that configs can simply import and get access to a lot of useful
functionality such as setting up paths, adding features that are
useful across all projects, and other utility-type functions.

For now this only updates llvm's suite to use this new library,
but subsequent patches will update other projects.

Differential Revision: https://reviews.llvm.org/D37778

llvm-svn: 313325
2017-09-15 00:34:00 +00:00
Sam Clegg 7c39594357 [WebAssembly] Use a separate wasm data segment for each global symbol
This is stepping stone towards honoring -fdata-sections
and letting the assembler decide how many wasm data
segments to create.

Differential Revision: https://reviews.llvm.org/D37834

llvm-svn: 313313
2017-09-14 23:07:53 +00:00
Matt Arsenault c317287fde AMDGPU: Fix violating constant bus restriction
You can't use madmk/madmk if it already uses an SGPR input.

llvm-svn: 313298
2017-09-14 20:54:29 +00:00
Guozhi Wei 21f8fad909 [TargetTransformInfo] Detect 0 latency instructions
For instructions that unlikely generate machine instructions, they should also have 0 latency.

Differential Revision: https://reviews.llvm.org/D37833

llvm-svn: 313288
2017-09-14 19:20:02 +00:00
Matt Arsenault 37ab4cf8b8 AMDGPU: Fix assert on alloca of array of struct
llvm-svn: 313282
2017-09-14 18:02:29 +00:00
Simon Dardis 6819a7a1cb [bpf] Fix test to always use little endian.
r313055 broke the big endian buildbots as the CHECK lines contained little
endian data but -triple bpf uses the host endian.

llvm-svn: 313281
2017-09-14 17:55:50 +00:00
Matt Arsenault defe371771 AMDGPU: Stop modifying SP in call sequences
Because the stack growth direction and addressing is done
in the same direction, modifying SP at the beginning of the
call sequence was incorrect. If we had a stack passed argument,
we would end up skipping that number of bytes before pushing
arguments, leaving unused/inconsistent space.

The callee creates fixed stack objects in its frame, so
the space necessary for these is already logically allocated
in the callee, so we just let the callee increment SP if
it really requires it.

llvm-svn: 313279
2017-09-14 17:37:40 +00:00
Dehao Chen 3a81f84d9a Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader.
Summary: SampleProfileLoader inlines hot functions if it is inlined in the profiled binary. However, the inline needs to be guarded by legality check, otherwise it could lead to correctness issues.

Reviewers: eraman, davidxl

Reviewed By: eraman

Subscribers: vitalybuka, sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D37779

llvm-svn: 313277
2017-09-14 17:29:56 +00:00
Simon Dardis 55e446737f [mips] Implement the 'dext' aliases and it's disassembly alias.
The other members of the dext family of instructions (dextm, dextu) are
traditionally handled by the assembler selecting the right variant of
'dext' depending on the values of the position and size operands.

When these instructions are disassembled, rather than reporting the
actual instruction, an equivalent aliased form of 'dext' is generated
and is reported. This is to mimic the behaviour of binutils.

Reviewers: slthakur, nitesh.jain, atanasyan

Differential Revision: https://reviews.llvm.org/D34887

llvm-svn: 313276
2017-09-14 17:27:53 +00:00
Adrian Prantl 82959cd5d8 Adapt more testcases for llvm-dwarfdump changes.
llvm-svn: 313275
2017-09-14 17:27:03 +00:00
Matt Arsenault 6efd082c01 AMDGPU: Make frame register caller preserved
Using SplitCSR for the frame register was very broken. Often
the copies in the prolog and epilog were optimized out, in addition
to them being inserted after the true prolog where the FP
was clobbered.

I have a hacky solution which works that continues to use
split CSR, but for now this is simpler and will get to working
programs.

llvm-svn: 313274
2017-09-14 17:14:57 +00:00
Adrian Prantl 5fd3d49bc4 llvm-dwarfdump: support dumping static archives.
llvm-svn: 313272
2017-09-14 17:01:53 +00:00
Krzysztof Parzyszek 779d98e1c0 TableGen support for parameterized register class information
This replaces TableGen's type inference to operate on parameterized
types instead of MVTs, and as a consequence, some interfaces have
changed:
- Uses of MVTs are replaced by ValueTypeByHwMode.
- EEVT::TypeSet is replaced by TypeSetByHwMode.

This affects the way that types and type sets are printed, and the
tests relying on that have been updated.

There are certain users of the inferred types outside of TableGen
itself, namely FastISel and GlobalISel. For those users, the way
that the types are accessed have changed. For typical scenarios,
these replacements can be used:
- TreePatternNode::getType(ResNo) -> getSimpleType(ResNo)
- TreePatternNode::hasTypeSet(ResNo) -> hasConcreteType(ResNo)
- TypeSet::isConcrete -> TypeSetByHwMode::isValueTypeByHwMode(false)

For more information, please refer to the review page.

Differential Revision: https://reviews.llvm.org/D31951

llvm-svn: 313271
2017-09-14 16:56:21 +00:00
Zachary Turner a0e55b6403 [lit] Force site configs to be run before source-tree configs
This patch simplifies LLVM's lit infrastructure by enforcing an ordering
that a site config is always run before a source-tree config.

A significant amount of the complexity from lit config files arises from
the fact that inside of a source-tree config file, we don't yet know if
the site config has been run.  However it is *always* required to run
a site config first, because it passes various variables down through
CMake that the main config depends on.  As a result, every config
file has to do a bunch of magic to try to reverse-engineer the location
of the site config file if they detect (heuristically) that the site
config file has not yet been run.

This patch solves the problem by emitting a mapping from source tree
config file to binary tree site config file in llvm-lit.py. Then, during
discovery when we find a config file, we check to see if we have a
target mapping for it, and if so we use that instead.

This mechanism is generic enough that it does not affect external users
of lit. They will just not have a config mapping defined, and everything
will work as normal.

On the other hand, for us it allows us to make many simplifications:

* We are guaranteed that a site config will be executed first
* Inside of a main config, we no longer have to assume that attributes
  might not be present and use getattr everywhere.
* We no longer have to pass parameters such as --param llvm_site_config=<path>
  on the command line.
* It is future-proof, meaning you don't have to edit llvm-lit.in to add
  support for new projects.
* All of the duplicated logic of trying various fallback mechanisms of
  finding a site config from the main config are now gone.

One potentially noteworthy thing that was required to implement this
change is that whereas the ninja check targets previously used the first
method to spawn lit, they now use the second. In particular, you can no
longer run lit.py against the source tree while specifying the various
`foo_site_config=<path>` parameters.  Instead, you need to run
llvm-lit.py.

Differential Revision: https://reviews.llvm.org/D37756

llvm-svn: 313270
2017-09-14 16:47:58 +00:00
Krzysztof Parzyszek 6ca02b25a7 [IfConversion] More simple, correct dead/kill liveness handling
Patch by Jesper Antonsson.

Differential Revision: https://reviews.llvm.org/D37611

llvm-svn: 313268
2017-09-14 15:53:11 +00:00
Simon Dardis 6f83ae38a3 [mips] Implement the 'dins' aliases.
Traditionally GAS has provided automatic selection between dins, dinsm and
dinsu. Binutils also disassembles all instructions in that family as 'dins'
rather than the actual instruction.

Reviewers: slthakur

Differential Revision: https://reviews.llvm.org/D34877

llvm-svn: 313267
2017-09-14 15:17:50 +00:00
Sanjay Patel 0d4fd5b668 [InstSimplify] fold sdiv/srem based on compare of dividend and divisor
This should bring signed div/rem analysis up to the same level as unsigned. 
We use icmp simplification to determine when the divisor is known greater than the dividend.

Each positive test is followed by a negative test to show that we're not overstepping the boundaries of the known bits.
There are extra tests for the signed-min-value special cases.

Alive proofs:
http://rise4fun.com/Alive/WI5

Differential Revision: https://reviews.llvm.org/D37713

llvm-svn: 313264
2017-09-14 14:59:07 +00:00
Chad Rosier 4d6d74e236 Add newline to end of test file. NFC.
llvm-svn: 313263
2017-09-14 14:48:59 +00:00
Simon Pilgrim 0b220c7524 [X86] Regenerate test. NFCI.
llvm-svn: 313259
2017-09-14 13:00:27 +00:00
Simon Pilgrim 47d8f62472 Regenerate test (broadcast comment). NFCI.
llvm-svn: 313258
2017-09-14 12:41:19 +00:00
Ayman Musa ab68449c53 [X86] When applying the shuffle-to-zero-extend transformation on floating point, bitcast to integer first.
Fix issue described in PR34577.

Differential Revision: https://reviews.llvm.org/D37803

llvm-svn: 313256
2017-09-14 12:06:38 +00:00
Simon Dardis 28365b33ad [mips] Pick the right variant of DINS upfront and enable target instruction verification
This patch complements D16810 "[mips] Make isel select the correct DEXT variant
up front.". Now ISel picks the right variant of DINS, so now there is no need
to replace DINS with the appropriate variant during
MipsMCCodeEmitter::encodeInstruction().

This patch also enables target specific instruction verification for ins, dins,
dinsm, dinsu, ext, dext, dextm, dextu. These instructions have constraints that
are checked when generating MipsISD::Ins and MipsISD::Ext nodes, but these
constraints are not checked during instruction selection. Adding machine
verification should catch outstanding cases.

Finally, correct a bug that instruction verification uncovered, where the
position operand of a DINSU generated during lowering was being silently
and accidently corrected to the correct value.

Reviewers: slthakur

Differential Revision: https://reviews.llvm.org/D34809

llvm-svn: 313254
2017-09-14 10:58:00 +00:00
Simon Pilgrim 8bd2d8780a [DAGCombine] (shl (or x, c1), c2) -> (or (shl x, c2), c1 << c2)
We already have a combine for this pattern when the input to shl is add, so we just need to enable the transformation when the input is or.

Original patch by @tstellar

Differential Revision: https://reviews.llvm.org/D19325

llvm-svn: 313251
2017-09-14 10:38:30 +00:00
Simon Pilgrim 337b2d007a Fix line endings. NFCI.
llvm-svn: 313247
2017-09-14 10:30:54 +00:00
Simon Pilgrim 11e2969a35 Fix line endings. NFCI.
llvm-svn: 313246
2017-09-14 10:30:22 +00:00
Dean Michael Berris d7b3add7c4 [XRay][DebugInfo] Update the test to use a specific target
Follow-up to D37791.

llvm-svn: 313243
2017-09-14 09:58:25 +00:00
Chandler Carruth 7376ae88eb [PM/CGSCC] Teach the CGSCC pass manager components to gracefully handle
invalidated SCCs even when we do not have an updated SCC to redirect
towards.

This comes up in a fairly subtle and surprising circumstance: we need to
have a connected but internal node in the call graph which later becomes
a disconnected island, and then gets deleted. All of this needs to
happen mid-CGSCC walk. Because it is disconnected, we have no way of
computing a new "current" SCC when it gets deleted. Instead, we need to
explicitly check for a deleted "current" SCC and bail out of the current
CGSCC step. This will bubble all the way up to the post-order walk and
then resume correctly.

I've included minimal tests for this bug. The specific behavior
matches something we've seen in the wild with the new PM combined with
ThinLTO and sample PGO, but I've not yet confirmed whether this is the
only issue there.

llvm-svn: 313242
2017-09-14 08:33:57 +00:00
Dean Michael Berris b21f580ddc [XRay][DebugInfo] Remove -debug-compile from test invocation of llc
This breaks bootstrap builds, and is actually unnecessary. Tested
locally and it seems we can remove -debug-comile just fine.

Follow-up to D37791.

llvm-svn: 313238
2017-09-14 07:54:54 +00:00
Alon Kom 682cfc1d4c [LV] Fix maximum legal VF calculation
This patch fixes pr34283, which exposed that the computation of
maximum legal width for vectorization was wrong, because it relied
on MaxInterleaveFactor to obtain the maximum stride used in the loop,
however not all strided accesses in the loop have an interleave-group
associated with them.
Instead of recording the maximum stride in the loop, which can be over
conservative (e.g. if the access with the maximum stride is not involved
in the dependence limitation), this patch tracks the actual maximum legal
width imposed by accesses that are involved in dependencies.

Differential Revision: https://reviews.llvm.org/D37507

llvm-svn: 313237
2017-09-14 07:40:02 +00:00
Dean Michael Berris 01fd7c8bd4 [XRay][CodeGen] Use the current function symbol as the associated symbol for the instrumentation map
Summary:
XRay had been assuming that the previous section is the "text" section
of the function when lowering the instrumentation map. Unfortunately
this is not a safe assumption, because we may be coming from lowering
debug type information for the function being lowered.

This fixes an issue with combining -gsplit-dwarf, -generate-type-units,
-debug-compile and -fxray-instrument for sole member functions. When the
split dwarf section is stripped, we're left with references from the
xray_instr_map to the debug section. The change now uses the function's
symbol instead of the previous section's start symbol.

We found the bug while attempting to strip the split debug sections off
an XRay-instrumented object file, which had a peculiar edge-case for
single-function classes where the single function is being lowered.
Because XRay had assocaited the instrumentation map for a function to
the debug types section instead of the function's section, the objcopy
call will fail due to the misplaced reference from the xray_instr_map
section.

Reviewers: pcc, dblaikie, echristo

Subscribers: llvm-commits, aprantl

Differential Revision: https://reviews.llvm.org/D37791

llvm-svn: 313233
2017-09-14 07:08:23 +00:00
Vitaly Buka 48624d327a Revert "Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader."
Patch introduced uninitialized value.

This reverts commit r313195.

llvm-svn: 313230
2017-09-14 05:40:33 +00:00
Peter Collingbourne cfbd089237 Reland r313157, "ThinLTO: Correctly follow aliasee references when dead stripping." which was reverted in r313222.
This reland includes a fix for the LowerTypeTests pass so that it
looks past aliases when determining which type identifiers are live.

Differential Revision: https://reviews.llvm.org/D37842

llvm-svn: 313229
2017-09-14 05:02:59 +00:00
Hans Wennborg ae050afeb9 Revert r313157 "ThinLTO: Correctly follow aliasee references when dead stripping."
This broke Chromium's CFI build; see crbug.com/765004.

> We were previously handling aliases during dead stripping by adding
> the aliased global's "original name" GUID to the worklist. This will
> lead to incorrect behaviour if the global has local linkage because
> the original name GUID will not correspond to the global's GUID in
> the summary.
>
> Because an alias is just another name for the global that it
> references, there is no need to mark the referenced global as used,
> or to follow references from any other copies of the global. So all
> we need to do is to follow references from the aliasee's summary
> instead of the alias.
>
> Differential Revision: https://reviews.llvm.org/D37789

llvm-svn: 313222
2017-09-14 00:40:14 +00:00
NAKAMURA Takumi 38fac5905e Move llvm/test/CodeGen/X86/clear-liverange-spillreg.mir to SystemZ. It was in wrong place.
llvm-svn: 313218
2017-09-14 00:03:23 +00:00
Matt Arsenault ecb43ef1bc AMDGPU: Don't spill SP reg like a normal CSR
llvm-svn: 313217
2017-09-13 23:47:01 +00:00
Hans Wennborg 06e2a384c2 Revert r312719 "[MachineCombiner] Update instruction depths incrementally for large BBs."
This caused PR34596.

> [MachineCombiner] Update instruction depths incrementally for large BBs.
>
> Summary:
> For large basic blocks with lots of combinable instructions, the
> MachineTraceMetrics computations in MachineCombiner can dominate the compile
> time, as computing the trace information is quadratic in the number of
> instructions in a BB and it's relevant successors/predecessors.
>
> In most cases, knowing the instruction depth should be enough to make
> combination decisions. As we already iterate over all instructions in a basic
> block, the instruction depth can be computed incrementally. This reduces the
> cost of machine-combine drastically in cases where lots of instructions
> are combined. The major drawback is that AFAIK, computing the critical path
> length cannot be done incrementally. Therefore we only compute
> instruction depths incrementally, for basic blocks with more
> instructions than inc_threshold. The -machine-combiner-inc-threshold
> option can be used to set the threshold and allows for easier
> experimenting and checking if using incremental updates for all basic
> blocks has any impact on the performance.
>
> Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
>
> Reviewed By: fhahn
>
> Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D36619

llvm-svn: 313213
2017-09-13 23:23:09 +00:00
Adrian Prantl c113801489 Update testcase that was XFAILed on Darwin for llvm-dwarfdump change.
llvm-svn: 313209
2017-09-13 22:30:43 +00:00
Stanislav Mekhanoshin 7fe9a5d9b4 Allow target to decide when to cluster loads/stores in misched
MachineScheduler when clustering loads or stores checks if base
pointers point to the same memory. This check is done through
comparison of base registers of two memory instructions. This
works fine when instructions have separate offset operand. If
they require a full calculated pointer such instructions can
never be clustered according to such logic.

Changed shouldClusterMemOps to accept base registers as well and
let it decide what to do about it.

Differential Revision: https://reviews.llvm.org/D37698

llvm-svn: 313208
2017-09-13 22:20:47 +00:00
Adrian Prantl 3ae35eb56b llvm-dwarfdump: automatically dump both regular and .dwo variant of sections
Since users typically don't really care about the .dwo / non.dwo
distinction, this patch makes it so dwarfdump --debug-<info,...> dumps
.debug_info and (if available) also .debug_info.dwo. This simplifies
the command line interface (I've removed all dwo-specific dump
options) and makes the tool friendlier to use.

Differential Revision: https://reviews.llvm.org/D37771

llvm-svn: 313207
2017-09-13 22:09:01 +00:00
Reid Kleckner 89af112cf5 [codeview] VLAs and unsized arrays should use a size of zero
Previously we used a size of '1' for VLAs because we weren't sure what
MSVC did. However, MSVC does support declaring an array without a size,
for which it emits an array type with a size of zero. Clang emits the
same DI metadata for VLAs and arrays without bound, so we would describe
arrays without bound as having one element. This lead to Microsoft
debuggers only printing a single element.

Emitting a size of zero appears to cause these debuggers to search the
symbol information to find a definition of the variable with accurate
array bounds.

Fixes http://crbug.com/763580

llvm-svn: 313203
2017-09-13 21:54:20 +00:00
Dehao Chen be707f2930 Update the early_inline test to explicitly add attribute for all functions. (NFC)
llvm-svn: 313202
2017-09-13 21:49:36 +00:00
Wei Mi a2a135a01c Add a comment for the test. NFC.
llvm-svn: 313199
2017-09-13 21:47:13 +00:00
Wei Mi c0d066468e [RegAlloc] Keep a copy of live interval for the spilled vregs in HoistSpillHelper.
This is to fix PR34502. After rL311401, the live range of spilled vreg will be
cleared. HoistSpill need to use the live range of the original vreg before splitting
to know the moving range of the spills. The patch saves a copy of live interval for
the spilled vreg inside of HoistSpillHelper.

Differential Revision: https://reviews.llvm.org/D37578

llvm-svn: 313197
2017-09-13 21:41:30 +00:00
Vedant Kumar 4c76c45fed [Bitcode] Add a compatibility test for 5.0.0 bitcode
llvm-svn: 313196
2017-09-13 21:40:59 +00:00