Commit Graph

143714 Commits

Author SHA1 Message Date
Ahmed Bougacha 452dc8ea7d [GlobalISel] Rename TargetGlobalISel.td to GISel/SelectionDAGCompat.td
llvm-svn: 293009
2017-01-25 02:41:26 +00:00
Greg Parker 17db7704cd Reinstate "r292904 - [lit] Allow boolean expressions in REQUIRES and XFAIL
and UNSUPPORTED"

This reverts the revert in r292942.

llvm-svn: 293007
2017-01-25 02:26:03 +00:00
Gor Nishanov df3d71a7a9 [coroutines] Spill the result of the invoke instruction correctly
Summary:
When we decide that the result of the invoke instruction need to be spilled, we need to insert the spill into a block that is on the normal edge coming out of the invoke instruction. (Prior to this change the code would insert the spill immediately after the invoke instruction, which breaks the IR, since invoke is a terminator instruction).

In the following example, we will split the edge going into %cont and insert the spill there.

```
  %r = invoke double @print(double 0.0) to label %cont unwind label %pad

  cont:
    %0 = call i8 @llvm.coro.suspend(token none, i1 false)
    switch i8 %0, label %suspend [i8 0, label %resume
                                  i8 1, label %cleanup]
  resume:
    call double @print(double %r)
```

Reviewers: majnemer

Reviewed By: majnemer

Subscribers: mehdi_amini, llvm-commits, EricWF

Differential Revision: https://reviews.llvm.org/D29102

llvm-svn: 293006
2017-01-25 02:25:54 +00:00
Tom Stellard 2f3f9855f0 AMDGPU add support for spilling to a user sgpr pointed buffers
Summary:
This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1].

Patch By: Dave Airlie

Reviewers: nhaehnle, arsenm, tstellarAMD

Reviewed By: arsenm

Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D25428

llvm-svn: 293000
2017-01-25 01:25:13 +00:00
Eugene Zelenko 11f6907f40 [AArch64] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 292996
2017-01-25 00:29:26 +00:00
Justin Bogner a029531e10 GlobalISel: Use the correct types when translating landingpad instructions
There was a bug here where we were using p0 instead of s32 for the
selector type in the landingpad. Instead of hardcoding these types we
should get the types from the landingpad instruction directly.

Note that we replicate an assert from SDAG here to only support
two-valued landingpads.

llvm-svn: 292995
2017-01-25 00:16:53 +00:00
Kevin Enderby 7a165755ba Fix llvm-objdump so it picks a good CPU based for Mach-O files
for CPU_SUBTYPE_ARM_V7S and CPU_SUBTYPE_ARM_V7K.

For these two cpusubtypes they should default to a cortex-a7 CPU
to give proper disassembly without a -mcpu= flag.

rdar://27431703

llvm-svn: 292993
2017-01-24 23:41:04 +00:00
Eugene Zelenko 8c6ed0f3a0 [XCore] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 292988
2017-01-24 23:02:48 +00:00
Matt Arsenault bf67cf7e4b AMDGPU: Remove spurious out branches after a kill
The sequence like this:
  v_cmpx_le_f32_e32 vcc, 0, v0
  s_branch BB0_30
  s_cbranch_execnz BB0_30
  ; BB#29:
  exp null off, off, off, off done vm
  s_endpgm
  BB0_30:
  ; %endif110

is likely wrong. The s_branch instruction will unconditionally jump
to BB0_30 and the skip block (exp done + endpgm) inserted for
performing the kill instruction will never be executed. This results
in a GPU hang with Star Ruler 2.

The s_branch instruction is added during the "Control Flow Optimizer"
pass which seems to re-organize the basic blocks, and we assume
that SI_KILL_TERMINATOR is always the last instruction inside a
basic block. Thus, after inserting a skip block we just go to the
next BB without looking at the subsequent instructions after the
kill, and the s_branch op is never removed.

Instead, we should remove the unconditional out branches and let
skip the two instructions if the exec mask is non-zero.

This patch fixes the GPU hang and doesn't introduce any regressions
with "make check".

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99019

Patch by Samuel Pitoiset <samuel.pitoiset@gmail.com>

llvm-svn: 292985
2017-01-24 22:18:39 +00:00
Wei Mi f1cf0278e8 Revert rL292621. Caused some internal build bot failures in apple.
llvm-svn: 292984
2017-01-24 22:15:06 +00:00
Eugene Zelenko 3943d2b0d7 [SystemZ] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 292983
2017-01-24 22:10:43 +00:00
Matt Arsenault 7aad8fd8f4 Enable FeatureFlatForGlobal on Volcanic Islands
This switches to the workaround that HSA defaults to
for the mesa path.

This should be applied to the 4.0 branch.

Patch by Vedran Miletić <vedran@miletic.net>

llvm-svn: 292982
2017-01-24 22:02:15 +00:00
Dehao Chen a5eb1689dc Explicitly promote indirect calls before sample profile annotation.
Summary: In iterative sample pgo where profile is collected from PGOed binary, we may see indirect call targets promoted and inlined in the profile. Before profile annotation, we need to make this happen in order to annotate correctly on IR. This patch explicitly promotes these indirect calls and inlines them before profile annotation.

Reviewers: xur, davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29040

llvm-svn: 292979
2017-01-24 21:05:51 +00:00
Saleem Abdulrasool 85824ee618 Demangle: correct demangling for CV-qualified functions
When demangling a CV-qualified function type with a final reference type
parameter, we would treat the reference type parameter as a r-value ref
accidentally.  This would result in the improper decoration of the
function type itself.

Resolves PR31741!

llvm-svn: 292976
2017-01-24 20:04:58 +00:00
Saleem Abdulrasool 25ee0a62ac Demangle: use named values for CV qualifiers
Rather than hard-coding magic values of 1, 2, 4 (bit-field), use an enum
to name the values.  NFC.

llvm-svn: 292975
2017-01-24 20:04:56 +00:00
Ivan Krasin 34e89ad0a4 Revert [AMDGPU][mc][tests][NFC] Add coverage/smoke tests for Gfx7 and Gfx8.
Reason: broke ASAN bots with a global buffer overflow.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/2291

Each test contains 20-30K test cases but takes only several (from 4 to 10)
seconds to complete on average machine. The tests cover the majority of
AMDGPU Gfx7/Gfx8 instructions, including many dark corners, and intended
to quickly find out if something is broken.

llvm-svn: 292974
2017-01-24 19:58:59 +00:00
Daniel Berlin 390dfde0f3 Remove the load hoisting code of MLSM, it is completely subsumed by GVNHoist
Summary:
GVNHoist performs all the optimizations that MLSM does to loads, in a
more general way, and in a faster time bound (MLSM is N^3 in most
cases, N^4 in a few edge cases).

This disables the load portion.

Note that the way ld_hoist_st_sink.ll is written makes one think that
the loads should be moved to the while.preheader block, but

1. Neither MLSM nor GVNHoist do it (they both move them to identical places).

2. MLSM couldn't possibly do it anyway, as the while.preheader block
is not the head of the diamond, while.body is.  (GVNHoist could do it
if it was legal).

3. At a glance, it's not legal anyway because the in-loop load
conflict with the in-loop store, so the loads must stay in-loop.

I am happy to update the test to use update_test_checks so that
checking is tighter, just was going to do it as a followup.

Note that i can find no particular benefit to the store portion on any
real testcase/benchmark i have (even size-wise).  If we really still
want it, i am happy to commit to writing a targeted store sinker, just
taking the code from the MemorySSA port of MergedLoadStoreMotion
(which is N^2 worst case, and N most of the time).

We can do what it does in a much better time bound.

We also should be both hoisting and sinking stores, not just sinking
them, anyway, since whether we should hoist or sink to merge depends
basically on luck of the draw of where the blockers are placed.

Nonetheless, i have left it alone for now.

Reviewers: chandlerc, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29079

llvm-svn: 292971
2017-01-24 19:55:36 +00:00
Changpeng Fang c85abbd955 AMDGPU/SI: Give up in promote alloca when a pointer may be captured.
Differential Revision:
  http://reviews.llvm.org/D28970

Reviewer:
  Matt

llvm-svn: 292966
2017-01-24 19:06:28 +00:00
Saleem Abdulrasool c38cd326fc Demangle: avoid butchering parameter type
When demangling a CV-qualified function type with a final parameter with
a reference type, we would insert the CV qualification on the parameter
rather than the function, and in the process adjust the insertion point
by one extra, splitting the type name.  This avoids doing so, even
though the attribution is still incorrect.

llvm-svn: 292965
2017-01-24 18:52:19 +00:00
Chad Rosier 8e11fbd15d [AArch64] Fix typo. NFC.
llvm-svn: 292959
2017-01-24 18:08:10 +00:00
Amaury Sechet d90f5f6698 Use InstCombine's builder in foldSelectCttzCtlz instead of creating a new one.
Summary: As per title. This will add the instructiions we are interested in in the worklist.

Reviewers: mehdi_amini, majnemer, andreadb

Differential Revision: https://reviews.llvm.org/D29081

llvm-svn: 292957
2017-01-24 17:48:25 +00:00
Stanislav Mekhanoshin 22a56f2f5a [AMDGPU] Add VGPR copies post regalloc fix pass
Regalloc creates COPY instructions which do not formally use VALU.
That results in v_mov instructions displaced after exec mask modification.
One pass which do it is SIOptimizeExecMasking, but potentially it can be
done by other passes too.

This patch adds a pass immediately after regalloc to add implicit exec
use operand to all VGPR copy instructions.

Differential Revision: https://reviews.llvm.org/D28874

llvm-svn: 292956
2017-01-24 17:46:17 +00:00
Evandro Menezes 7784cacd91 [AArch64] Rename 'no-quad-ldst-pairs' to 'slow-paired-128'
In order to follow the pattern of the existing 'slow-misaligned-128store'
option, rename the option 'no-quad-ldst-pairs' to 'slow-paired-128'.

llvm-svn: 292954
2017-01-24 17:34:31 +00:00
Chris Bieneman bef847c3ae [Lanai] Rename LanaiInstPrinter library to LanaiAsmPrinter
Summary:
    This is in keeping with LLVM convention. The classes are InstPrinters, but the library is ${target}AsmPrinter.

This patch is in response to bryant pointing out to me that Lanai was the only backend deviating from convention here. Thanks!

Reviewers: jpienaar, bryant

Subscribers: mgorny, jgosnell, llvm-commits

Differential Revision: https://reviews.llvm.org/D29043

llvm-svn: 292953
2017-01-24 17:27:01 +00:00
Sanjay Patel 562272536a [InstSimplify] try to eliminate icmp Pred (add nsw X, C1), C2
I was surprised to see that we're missing icmp folds based on 'add nsw' in InstCombine, 
but we should handle the InstSimplify cases first because that could make the InstCombine
code simpler.

Here are Alive-based proofs for the logic:

Name: add_neg_constant
Pre: C1 < 0 && (C2 > ((1<<(width(C1)-1)) + C1))
%a = add nsw i7 %x, C1
%b = icmp sgt %a, C2
  =>
%b = false

Name: add_pos_constant
Pre: C1 > 0 && (C2 < ((1<<(width(C1)-1)) + C1 - 1))
%a = add nsw i6 %x, C1
%b = icmp slt %a, C2
  =>
%b = false

Name: nuw
Pre: C1 u>= C2
%a = add nuw i11 %x, C1
%b = icmp ult %a, C2
  =>
%b = false

Differential Revision: https://reviews.llvm.org/D29053

llvm-svn: 292952
2017-01-24 17:03:24 +00:00
Simon Pilgrim 5bdb95c078 [X86][AVX2] Regenerate test.
llvm-svn: 292950
2017-01-24 16:58:22 +00:00
Reid Kleckner 11cf053bd1 [CodeView] Fix off-by-one error in def range gap emission
Also fixes a much worse bug where we emitted the wrong gap size for the
def range uncovered by the test for this issue.

Fixes PR31726.

llvm-svn: 292949
2017-01-24 16:57:55 +00:00
Simon Pilgrim 1b80a14685 [X86][AVX2] Removed FIXME comment and regenerated test.
The comment talked about replacing vpmovzxwd+vpslld+vpsrad with vpmovsxwd - which isn't valid as we're sign extending a <8 x i1> bool vector not an all/nobits <8 x i16>

llvm-svn: 292948
2017-01-24 16:56:23 +00:00
Simon Pilgrim fa6afd1bc6 [X86][AVX2] Cleaned up test triple and regenerated tests.
llvm-svn: 292946
2017-01-24 16:53:09 +00:00
Geoff Berry 92a286ae5a [SelectionDAG] Handle inverted conditions when splitting into multiple branches.
Summary:
When conditional branches with complex conditions are split into
multiple branches in SelectionDAGBuilder::FindMergedConditions, also
handle inverted conditions.  These may sometimes appear without having
been optimized by InstCombine when CodeGenPrepare decides to sink and
duplicate cmp instructions, causing them to have only one use.  This
problem can be increased by e.g. GVNHoist hiding more cmps from
InstCombine by combining equivalent cmps from different blocks.

For example codegen X & !(Y | Z) as:
    jmp_if_X TmpBB
    jmp FBB
  TmpBB:
    jmp_if_notY Tmp2BB
    jmp FBB
  Tmp2BB:
    jmp_if_notZ TBB
    jmp FBB

Reviewers: bogner, MatzeB, qcolombet

Subscribers: llvm-commits, hiraditya, mcrosier, sebpop

Differential Revision: https://reviews.llvm.org/D28380

llvm-svn: 292944
2017-01-24 16:36:07 +00:00
Alex Lorenz 9111cc217d Revert "r292904 - [lit] Allow boolean expressions in REQUIRES and XFAIL
and UNSUPPORTED"

After r292904 llvm-lit fails to emit the test results in the XML format for
Apple's internal buildbots.

rdar://30164800

llvm-svn: 292942
2017-01-24 16:17:04 +00:00
Simon Pilgrim 893d2119ee [X86][AVX512] Remove unused argument from PMOVX tablegen patterns. NFCI.
Seems to be a copy+paste legacy from the AVX2 patterns.

llvm-svn: 292941
2017-01-24 16:16:29 +00:00
Amaury Sechet 5da456e6a1 Fix formating in foldSelectCttzCtlz. NFC
llvm-svn: 292934
2017-01-24 14:22:27 +00:00
Jonas Paulsson d9ae93ac9e Improve comment for ISD::EXTRACT_VECTOR_ELT
The comment in ISDOpcodes.h for EXTRACT_VECTOR_ELT now explains that the high
bits are undefined if the result is extended.

Review: Hal Finkel
llvm-svn: 292933
2017-01-24 14:21:29 +00:00
Chandler Carruth 6acdca78a0 [PH] Replace uses of AssertingVH from members of analysis results with
a lazy-asserting PoisoningVH.

AssertVH is fundamentally incompatible with cache-invalidation of
analysis results. The invaliadtion happens after the AssertingVH has
already fired. Instead, use a PoisoningVH that will assert if the
dangling handle is ever used rather than merely be assigned or
destroyed.

This patch also removes all of the (numerous) doomed attempts to work
around this fundamental incompatibility. It is a pretty significant
simplification IMO.

The most interesting change is in the Inliner where we still do some
clearing because we don't want to rely on the coarse grained
invalidation strategy of the containing pass manager. However, I prefer
the approach that contains this logic to the cleanup phase of the
Inliner, and I think we could enhance the CGSCC analysis management
layer to make this even better in the future if desired.

The rest is straight cleanup.

I've also added a test for one of the harder cases to work around: when
a *module analysis* contains many AssertingVHes pointing at functions.

Differential Revision: https://reviews.llvm.org/D29006

llvm-svn: 292928
2017-01-24 12:55:57 +00:00
Chandler Carruth 942c31474f [PM] Introduce a PoisoningVH as a (more expensive) alternative to
AssertingVH that delays any reported error until the handle is *used*.

This allows data structures to contain handles which become dangling
provided the data structure is cleaned up afterward rather than used for
anything interesting.

The implementation is moderately horrible in part because it works to
leave AssertingVH in place, undisturbed. If at some point there is
consensus that this is simply how AssertingVH should be used, it can be
substantially simplified.

This remains a boring pointer in a non-asserts build as you would
expect. The only place we pay cost is in asserts builds.

I plan to use this as a basis for replacing the asserting VHs that
currently dangle in the new PM until invalidation occurs in both LVI and
SCEV.

Differential Revision: https://reviews.llvm.org/D29061

llvm-svn: 292925
2017-01-24 12:34:47 +00:00
Martin Bohme 526299c81c [X86][SSE] Add explicit braces to avoid -Wdangling-else warning.
Reviewers: RKSimon

Subscribers: llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D29076

llvm-svn: 292924
2017-01-24 12:31:30 +00:00
Artem Tamazov 819da50d12 [AMDGPU][mc][tests][NFC] Add coverage/smoke tests for Gfx7 and Gfx8.
Each test contains 20-30K test cases but takes only several (from 4 to 10)
seconds to complete on average machine. The tests cover the majority of
AMDGPU Gfx7/Gfx8 instructions, including many dark corners, and intended
to quickly find out if something is broken.

llvm-svn: 292922
2017-01-24 12:22:01 +00:00
Simon Pilgrim 0c45338961 Fix unused variable warning
llvm-svn: 292921
2017-01-24 11:54:27 +00:00
Simon Pilgrim e1ec9072f6 [X86][SSE] Add support for constant folding vector arithmetic shift by immediates
llvm-svn: 292919
2017-01-24 11:46:13 +00:00
Pavel Labath 504400977a Fix fs::set_current_path unit test
The test fails when there is a symlink on the path because then the path
returned by current_path will not match the one we have set. Instead of
doing a string match check the unique id of the two files.

llvm-svn: 292916
2017-01-24 11:35:26 +00:00
Simon Pilgrim 6340e54861 [X86][SSE] Add support for constant folding vector logical shift by immediates
llvm-svn: 292915
2017-01-24 11:21:57 +00:00
Simon Pilgrim 78f8630ac0 [InstCombine][X86] MULDQ/MULUDQ undef -> zero
Added early out for single undef input - we were already supporting (and testing) this in the constant folding code, we just do it quicker now

Drop undef handling from demanded elts code now that we handle it fully in InstCombiner::visitCallInst

llvm-svn: 292913
2017-01-24 11:07:41 +00:00
Pavel Labath f726dfa65e [Support] Use O_CLOEXEC only when declared
Summary:
Use the O_CLOEXEC flag only when it is available. Some old systems (e.g.
SLES10) do not support this flag. POSIX explicitly guarantees that this
flag can be checked for using #if, so there is no need for a CMake
check.

In case O_CLOEXEC is not supported, fall back to fcntl(FD_CLOEXEC)
instead.

Reviewers: rnk, rafael, mgorny

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28894

llvm-svn: 292912
2017-01-24 10:57:01 +00:00
Alexey Bataev 992ac2d5c2 [SLP] Additional test for checking that instruction with extra args is
not reconstructed.

llvm-svn: 292911
2017-01-24 10:44:00 +00:00
Pavel Labath 2f0960970f [Support] Add sys::fs::set_current_path() (aka chdir)
Summary:
This adds a cross-platform way of setting the current working directory
analogous to the existing current_path() function used for retrieving
it. The function will be used in lldb.

Reviewers: rafael, silvas, zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29035

llvm-svn: 292907
2017-01-24 10:32:03 +00:00
Greg Parker ed0a95cbec [lit] Allow boolean expressions in REQUIRES and XFAIL and UNSUPPORTED
A `lit` condition line is now a comma-separated list of boolean expressions. 
Comma-separated expressions act as if each expression were on its own 
condition line:
For REQUIRES, if every expression is true then the test will run. 
For UNSUPPORTED, if every expression is false then the test will run. 
For XFAIL, if every expression is false then the test is expected to succeed. 
As a special case "XFAIL: *" expects the test to fail.

Examples:
# Test is expected fail on 64-bit Apple simulators and pass everywhere else
XFAIL: x86_64 && apple && !macosx
# Test is unsupported on Windows and on non-Ubuntu Linux 
# and supported everywhere else
UNSUPPORTED: linux && !ubuntu, system-windows

Syntax: 
* '&&', '||', '!', '(', ')'. 'true' is true. 'false' is false.
* Each test feature is a true identifier. 
* Substrings of the target triple are true identifiers for UNSUPPORTED 
 and XFAIL, but not for REQUIRES. (This matches the current behavior.)
* All other identifiers are false.
* Identifiers are [-+=._a-zA-Z0-9]+

Differential Revision: https://reviews.llvm.org/D18185

llvm-svn: 292904
2017-01-24 09:58:02 +00:00
Greg Parker d972882f06 Revert "[lit] Allow boolean expressions in REQUIRES and XFAIL and UNSUPPORTED"
This change needs to be better-coordinated with libc++.

llvm-svn: 292900
2017-01-24 08:58:20 +00:00
Alexey Bataev 9f8bb384af [SLP] Refactoring of HorizontalReduction class, NFC.
Removed data members ReduxWidth and MinVecRegSize + some C++11 stylish
improvements.

Differential Revision: https://reviews.llvm.org/D29010

llvm-svn: 292899
2017-01-24 08:57:17 +00:00
Greg Parker 2ab45201e7 [lit] Allow boolean expressions in REQUIRES and XFAIL and UNSUPPORTED
A `lit` condition line is now a comma-separated list of boolean expressions. 
Comma-separated expressions act as if each expression were on its own 
condition line:
For REQUIRES, if every expression is true then the test will run. 
For UNSUPPORTED, if every expression is false then the test will run. 
For XFAIL, if every expression is false then the test is expected to succeed. 
As a special case "XFAIL: *" expects the test to fail.

Examples:
# Test is expected fail on 64-bit Apple simulators and pass everywhere else
XFAIL: x86_64 && apple && !macosx
# Test is unsupported on Windows and on non-Ubuntu Linux 
# and supported everywhere else
UNSUPPORTED: linux && !ubuntu, system-windows

Syntax: 
* '&&', '||', '!', '(', ')'. 'true' is true. 'false' is false.
* Each test feature is a true identifier. 
* Substrings of the target triple are true identifiers for UNSUPPORTED 
  and XFAIL, but not for REQUIRES. (This matches the current behavior.)
* All other identifiers are false.
* Identifiers are [-+=._a-zA-Z0-9]+

Differential Revision: https://reviews.llvm.org/D18185

llvm-svn: 292896
2017-01-24 08:45:50 +00:00