Commit Graph

48476 Commits

Author SHA1 Message Date
Sanjay Patel 85b60ff68c [PassManager, SimplifyCFG] add test for PR34603 / D38566; NFC
Sinking common insts and converting to select early can inhibit better folds in other passes.

llvm-svn: 316908
2017-10-30 14:34:30 +00:00
Yaxun Liu c928f2a6d4 [AMDGPU] Emit metadata for hidden arguments for kernel enqueue
Identifies kernels which performs device side kernel enqueues and emit
metadata for the associated hidden kernel arguments. Such kernels are
marked with calls-enqueue-kernel function attribute by
AMDGPUOpenCLEnqueueKernelLowering pass and later on
hidden kernel arguments metadata HiddenDefaultQueue and
HiddenCompletionAction are emitted for them.

Differential Revision: https://reviews.llvm.org/D39255

llvm-svn: 316907
2017-10-30 14:30:28 +00:00
Clement Courbet b2c3eb8cf1 [CodeGen][ExpandMemcmp] Allow memcmp to expand to vector loads (2).
- Targets that want to support memcmp expansions now return the list of
   supported load sizes.
 - Expansion codegen does not assume that all power-of-two load sizes
   smaller than the max load size are valid. For examples, this is not the
   case for x86(32bit)+sse2.

Fixes PR34887.

llvm-svn: 316905
2017-10-30 14:19:33 +00:00
Javed Absar 5cde1ccb29 [GlobalISel|ARM] : Allow legalizing G_FSUB
Adding support for VSUB.
Reviewed by: @rovka
Differential Revision: https://reviews.llvm.org/D39261

llvm-svn: 316902
2017-10-30 13:51:56 +00:00
Andrew V. Tischenko f94da596a7 Invalid used of 'w' suffix on push and pop using 64-bit register.
Differential Revision: https://reviews.llvm.org/D38626

llvm-svn: 316898
2017-10-30 12:02:06 +00:00
Diana Picus 6e41e6ac6c [ARM GlobalISel] Fixup r316572. NFC
Just missed a few spots...

llvm-svn: 316897
2017-10-30 11:58:09 +00:00
Jina Nahias e63db55c67 Revert "[X86][AVX512] Adding a pattern for broadcastm intrinsic."
This reverts commit r316890.

Change-Id: I683cceee9848ef309b452293086b1f26a941950d
llvm-svn: 316894
2017-10-30 10:35:53 +00:00
Florian Hahn d0208b4b1c Recommit r315288: [SCCP] Propagate integer range info for parameters in IPSCCP.
This version of the patch includes a fix addressing a stage2 LTO buildbot
failure and addressed some additional nits.

Original commit message:
This updates the SCCP solver to use of the ValueElement lattice for
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.

For the following function, f() can be optimized to ret i32 2 with
this change

    source_filename = "sccp.c"
    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
    target triple = "x86_64-unknown-linux-gnu"

    ; Function Attrs: norecurse nounwind readnone uwtable
    define i32 @main() local_unnamed_addr #0 {
    entry:
      %call = tail call fastcc i32 @f(i32 1)
      %call1 = tail call fastcc i32 @f(i32 47)
      %add3 = add nsw i32 %call, %call1
      ret i32 %add3
    }

    ; Function Attrs: noinline norecurse nounwind readnone uwtable
    define internal fastcc i32 @f(i32 %x) unnamed_addr #1 {
    entry:
      %c1 = icmp sle i32 %x, 100

      %cmp = icmp sgt i32 %x, 300
      %. = select i1 %cmp, i32 1, i32 2
      ret i32 %.
    }

    attributes #1 = { noinline }

Reviewers: davide, sanjoy, efriedma, dberlin

Reviewed By: davide, dberlin

Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D36656

llvm-svn: 316891
2017-10-30 10:07:42 +00:00
Jina Nahias 70280f9a0d [X86][AVX512] Adding a pattern for broadcastm intrinsic.
Differential Revision: https://reviews.llvm.org/D38312

Change-Id: I6551fb13879e098aed74de410e29815cf37d9ab5
llvm-svn: 316890
2017-10-30 09:59:52 +00:00
Max Kazantsev 390fc57771 [IRCE][NFC] Store Length as SCEV in RangeCheck instead of Value
llvm-svn: 316889
2017-10-30 09:35:16 +00:00
Florian Hahn d18443edad Revert r316887 to fix buildbot failures.
llvm-svn: 316888
2017-10-30 09:21:50 +00:00
Florian Hahn 925d3e4a98 Recommit r315288: [SCCP] Propagate integer range info for parameters in IPSCCP.
This version of the patch includes a fix addressing a stage2 LTO buildbot
failure and addressed some additional nits.

Original commit message:
This updates the SCCP solver to use of the ValueElement lattice for
parameters, which provides integer range information. The range
information is used to remove unneeded icmp instructions.

For the following function, f() can be optimized to ret i32 2 with
this change

    source_filename = "sccp.c"
    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
    target triple = "x86_64-unknown-linux-gnu"

    ; Function Attrs: norecurse nounwind readnone uwtable
    define i32 @main() local_unnamed_addr #0 {
    entry:
      %call = tail call fastcc i32 @f(i32 1)
      %call1 = tail call fastcc i32 @f(i32 47)
      %add3 = add nsw i32 %call, %call1
      ret i32 %add3
    }

    ; Function Attrs: noinline norecurse nounwind readnone uwtable
    define internal fastcc i32 @f(i32 %x) unnamed_addr #1 {
    entry:
      %c1 = icmp sle i32 %x, 100

      %cmp = icmp sgt i32 %x, 300
      %. = select i1 %cmp, i32 1, i32 2
      ret i32 %.
    }

    attributes #1 = { noinline }

Reviewers: davide, sanjoy, efriedma, dberlin

Reviewed By: davide, dberlin

Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D36656

llvm-svn: 316887
2017-10-30 09:04:18 +00:00
Simon Pilgrim 601ae238b7 [SelectionDAG] Add SEXT/AND/XOR/Or demanded elts support to ComputeNumSignBits
llvm-svn: 316875
2017-10-29 22:03:37 +00:00
Simon Pilgrim aa65159b0a [X86][SSE] Split ComputeNumSignBits SEXT/AND/XOR/OR demandedelts test
Max depth was being exceeded which could prevent some combines working

llvm-svn: 316871
2017-10-29 21:35:28 +00:00
Sanjay Patel adf38911d8 [(new) Pass Manager] instantiate SimplifyCFG with the same options as the old PM
The old PM sets the options of what used to be known as "latesimplifycfg" on the 
instantiation after the vectorizers have run, so that's what we'redoing here.

FWIW, there's a later SimplifyCFGPass instantiation in both PMs where we do not 
set the "late" options. I'm not sure if that's intentional or not.

Differential Revision: https://reviews.llvm.org/D39407

llvm-svn: 316869
2017-10-29 20:49:31 +00:00
Simon Pilgrim 61921f779f [X86][SSE] ComputeNumSignBits tests showing missing SEXT/AND/XOR/OR demandedelts support
llvm-svn: 316868
2017-10-29 20:49:27 +00:00
Simon Pilgrim 7613a7b564 [SelectionDAG] Add SRA/SHL demanded elts support to ComputeNumSignBits
Introduce a isConstOrDemandedConstSplat helper function that can recognise a constant splat build vector for at least the demanded elts we care about.

llvm-svn: 316866
2017-10-29 18:19:37 +00:00
Simon Pilgrim b56fb4a2bb [X86][SSE] ComputeNumSignBits tests showing missing SHL/SRA demandedelts support
llvm-svn: 316865
2017-10-29 18:01:31 +00:00
Craig Topper 41d363fea1 [X86] Add a slow-incdec command line to atomic-eflags-reuse.ll
I believe the test_sub_1_cmp_1_setcc_ugt test case is being miscompiled in the fast inc/dec case.

llvm-svn: 316864
2017-10-29 17:15:09 +00:00
Craig Topper 495a1bc893 [X86] Remove combine that turns X86ISD::LSUB into X86ISD::LADD. Update patterns that depended on this.
If the carry flag is being used, this transformation isn't safe.

This does prevent some test cases from using DEC now, but I'll try to look into that separately.

Fixes PR35068.

llvm-svn: 316860
2017-10-29 06:51:04 +00:00
Craig Topper 5f2289a13c [X86] Add AVX512 support to X86FastISel::X86SelectFPExt and X86FastISel::X86SelectFPTrunc.
llvm-svn: 316856
2017-10-29 02:50:31 +00:00
Craig Topper 8373336a22 [X86] Use update_llc_test_checks.py to regenerate fast-isel-int-float-conversion.ll
llvm-svn: 316855
2017-10-29 02:25:48 +00:00
Craig Topper 7d1ed9ec83 [X86] Use update_llc_test_checks.py to regenerate fast-isel-fptrunc-fpext.ll
llvm-svn: 316854
2017-10-29 02:18:43 +00:00
Craig Topper 1e30d783dd [X86] Add AVX512 support to X86FastISel::X86MaterializeFP
llvm-svn: 316853
2017-10-29 02:18:41 +00:00
Simon Pilgrim b37a24e82f [SelectionDAG] Add support for INSERT_SUBVECTOR to computeKnownBits
llvm-svn: 316847
2017-10-28 22:10:40 +00:00
Simon Pilgrim 294f88dfa0 [X86][SSE] Combine 128-bit target shuffles to PACKSS/PACKUS.
llvm-svn: 316845
2017-10-28 20:51:27 +00:00
Sanjay Patel b049173157 [SimplifyCFG] use pass options and remove the latesimplifycfg pass
This is no-functional-change-intended.

This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and 
D35411 (defer folding unconditional branches) with pass parameters rather than a named
"latesimplifycfg" pass. Now that we have individual options to control the functionality,
we could decouple when these fire (but that's an independent patch if desired). 

The next planned step would be to add another option bit to disable the sinking transform
mentioned in D38566. This should also make it clear that the new pass manager needs to
be updated to limit simplifycfg in the same way as the old pass manager.

Differential Revision: https://reviews.llvm.org/D38631

llvm-svn: 316835
2017-10-28 18:43:07 +00:00
Craig Topper abe5dbafff [X86] Correct the alignments on the aligned test cases in fast-isel-vecload.ll to make sure they test selection of aligned loads.
llvm-svn: 316833
2017-10-28 17:37:51 +00:00
Simon Pilgrim d09c1ac20f [SelectionDAG] Support 'bit preserving' floating points bitcasts on computeKnownBits/ComputeNumSignBits
For cases where we know the floating point representations match the bitcasted integer equivalent, allow bitcasting to these types.

This is especially useful for the X86 floating point compare results which return all/zero bits but as a floating point type.

Differential Revision: https://reviews.llvm.org/D39289

llvm-svn: 316831
2017-10-28 14:27:53 +00:00
Craig Topper 39cfdc664d [X86] Add avx command lines to fast-isel-constpool.ll to improve coverage.
llvm-svn: 316829
2017-10-28 06:31:48 +00:00
Craig Topper ea83f85da0 [X86] Use update_llc_test_checks.py to regenerate fast-isel-constpool.ll
llvm-svn: 316828
2017-10-28 06:31:46 +00:00
Craig Topper 8ca5863dd8 [X86] Add a fast-isel test for the i8 pseudo cmov.
llvm-svn: 316827
2017-10-28 06:10:03 +00:00
Haicheng Wu eb92e569de [ConstantFold] Fix a crash when folding a GEP that has vector index
LLVM crashes when factoring out an out-of-bound index into preceding dimension
and the preceding dimension uses vector index.  Simply bail out now when this
case happens.

Differential Revision: https://reviews.llvm.org/D38677

llvm-svn: 316824
2017-10-28 02:27:14 +00:00
Craig Topper fd0a35a649 [X86] Add avx command lines to two fast-isel tests to get coverage of selecting vucomiss/vucomisd.
The selection of these shows up as a code coverage hole when looking at the llvm-cov link on llvm.org

llvm-svn: 316823
2017-10-28 02:03:59 +00:00
Craig Topper 4390c61fad [X86] Use update_llc_test_checks.py to regenerate fast-isel-select-cmov2.ll
llvm-svn: 316822
2017-10-28 02:03:58 +00:00
Craig Topper 49687104d6 [PartialInlineLibCalls] Teach PartialInlineLibCalls to honor nobuiltin, properly check the function signature, and check TLI::has
Summary:
We shouldn't do this transformation if the function is marked nobuitlin.

We were only checking that the return type is floating point, we really should be checking the argument types and argument count as well. This can be accomplished by using the other version of getLibFunc that takes the Function and not just the name.

We should also be checking TLI::has since sqrtf is a macro on Windows.

Fixes PR32559.

Reviewers: hfinkel, spatel, davide, efriedma

Reviewed By: davide, efriedma

Subscribers: efriedma, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D39381

llvm-svn: 316819
2017-10-28 00:36:58 +00:00
Tom Stellard d0c6cf2e8c AMDGPU/GlobalISel: Mark 32-bit G_FADD as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D38439

llvm-svn: 316815
2017-10-27 23:57:41 +00:00
Jake Ehrlich f22728e636 Revert "Add support for writing 64-bit symbol tables for archives when offsets become too large for 32-bit"
This reverts commit r316805.

llvm-svn: 316813
2017-10-27 23:39:31 +00:00
Jake Ehrlich 9d5a7c3b8c Add support for writing 64-bit symbol tables for archives when offsets become too large for 32-bit
This should fix https://bugs.llvm.org//show_bug.cgi?id=34189

This change makes it so that if writing a K_GNU style archive, you need
to output a > 32-bit offset it should output in K_GNU64 style instead.

Differential Revision: https://reviews.llvm.org/D36812

llvm-svn: 316805
2017-10-27 22:26:37 +00:00
Krzysztof Parzyszek 4dc04e6a70 [Hexagon] Adjust patterns to reflect instruction selection preferences
llvm-svn: 316804
2017-10-27 22:24:49 +00:00
Guozhi Wei 7c67009fe5 [DAGCombine] Don't combine sext with extload if sextload is not supported and extload has multi users
In function DAGCombiner::visitSIGN_EXTEND_INREG, sext can be combined with extload even if sextload is not supported by target, then

  if sext is the only user of extload, there is no big difference, no harm no benefit.
  if extload has more than one user, the combined sextload may block extload from combining with other zext, causes extra zext instructions generated. As demonstrated by the attached test case.

This patch add the constraint that when sextload is not supported by target, sext can only be combined with extload if it is the only user of extload.

Differential Revision: https://reviews.llvm.org/D39108

llvm-svn: 316802
2017-10-27 21:54:24 +00:00
Rafael Espindola 2393c3b4e1 Handle undefined weak hidden symbols on all architectures.
We were handling the non-hidden case in lib/Target/TargetMachine.cpp,
but the hidden case was handled in architecture dependent code and
only X86_64 and AArch64 were covered.

While it is true that some code sequences in some ABIs might be able
to produce the correct value at runtime, that doesn't seem to be the
common case.

I left the AArch64 code in place since it also forces a got access for
non-pic code. It is not clear if that is needed, but it is probably
better to change that in another commit.

llvm-svn: 316799
2017-10-27 21:18:48 +00:00
Craig Topper b904c70005 [X86] Add fast-isel tests for integer shifts. We definitely had no coverage of i16 and i32/i64 are only tested by larger tests.
llvm-svn: 316796
2017-10-27 21:00:56 +00:00
Artur Gainullin af7ba8ff6b Improve clamp recognition in ValueTracking.
Summary:
ValueTracking was recognizing not all variations of clamp. Swapping of
true value and false value of select was added to fix this problem. The
first patch was reverted because it caused miscompile in NVPTX target. 
Added corresponding test cases.

Reviewers: spatel, majnemer, efriedma, reames

Subscribers: llvm-commits, jholewinski

Differential Revision: https://reviews.llvm.org/D39240

llvm-svn: 316795
2017-10-27 20:53:41 +00:00
Craig Topper 58fe564e93 [X86] Add avx512vl command line to fast-isel-nontemporal.ll
llvm-svn: 316789
2017-10-27 20:13:06 +00:00
Krzysztof Parzyszek 92a2635bbd [Hexagon] Fix an incorrect assertion in HexagonConstExtenders.cpp
Making sure that an instruction has fewer operands than required, then
attempting to access one out of range is going to fail.

llvm-svn: 316785
2017-10-27 18:52:28 +00:00
Simon Pilgrim 1bfaa453a3 [X86][SSE] Add tests for inserting all-bits (-1) into a vector
We should be able to do this by re-materializing an all-bits vector and then blending with it

llvm-svn: 316779
2017-10-27 18:14:12 +00:00
Artur Pilipenko 8aadc643cf [LoopPredication] Handle the case when the guard and the latch IV have different offsets
This is a follow up change for D37569.

Currently the transformation is limited to the case when:
 * The loop has a single latch with the condition of the form: ++i <pred> latchLimit, where <pred> is u<, u<=, s<, or s<=.
 * The step of the IV used in the latch condition is 1.
 * The IV of the latch condition is the same as the post increment IV of the guard condition.
 * The guard condition is of the form i u< guardLimit.

This patch enables the transform in the case when the latch is

 latchStart + i <pred> latchLimit, where <pred> is u<, u<=, s<, or s<=.

And the guard is

 guardStart + i u< guardLimit

Reviewed By: anna

Differential Revision: https://reviews.llvm.org/D39097

llvm-svn: 316768
2017-10-27 14:46:17 +00:00
Clement Courbet be684eee82 [CodeGen][ExpandMemCmp][NFC] Simplify load sequence generation.
llvm-svn: 316763
2017-10-27 12:34:18 +00:00
George Rimar 144e4c5a32 [llvm-dwarfdump] - Teach verifier to report broken DWARF expressions.
Patch improves next things:

* Fixes assert/crash in getOpDesc when giving it a invalid expression op code.
* DWARFExpression::print() called DWARFExpression::Operation::getEndOffset() which
  returned and used uninitialized field EndOffset. Patch fixes that.
* Teaches verifier to verify DW_AT_location and error out on broken expressions.

Differential revision: https://reviews.llvm.org/D39294

llvm-svn: 316756
2017-10-27 10:42:04 +00:00