Commit Graph

141600 Commits

Author SHA1 Message Date
Davide Italiano 7f1bad88c3 [llc] Fix -stop-after=consthoist initializing the pass.
llvm-svn: 288864
2016-12-06 23:49:58 +00:00
Matt Arsenault 269ffdac4e AMDGPU: Fix crash on i16 constant expression
llvm-svn: 288861
2016-12-06 23:18:06 +00:00
Peter Collingbourne 7357b2ad62 LowerTypeTests: Improve performance by optimising type metadata queries.
Requesting metadata for a global is a relatively expensive operation as it
involves a map lookup, but it's one that we need to do relatively frequently in
this pass to collect the list of type metadata nodes associated with a global.
This change improves the performance of type metadata queries by prebuilding
data structures that keep the global together with its list of type metadata,
and changing the pass to use that data structure wherever we were previously
passing global references around.

This change also eliminates some O(N^2) behavior by collecting the list of
globals associated with each type identifier during the first pass over the
list of globals rather than visiting each global to compute that list every
time we add a new type identifier.

Reduces pass runtime on a module containing Chrome's vtables from over 60s
to 0.9s.

Differential Revision: https://reviews.llvm.org/D27484

llvm-svn: 288859
2016-12-06 23:02:13 +00:00
Simon Pilgrim 0559b9e557 [X86][XOP] Add test case for PR31296
llvm-svn: 288858
2016-12-06 22:50:13 +00:00
Eli Friedman 0a76e3241f [CodeGen] Fix result type for SMULO/UMULO legalization
On some platforms (like MSP430) the second element of the result
structure for SMULO/UMULO may have a shorter type than the one
returned by SetCC. We need to truncate it to the right type, or
else some incorrect code may be generated later on.

This fixes issue https://github.com/rust-lang/rust/issues/37829

Patch by Vadzim Dambrouski!

Differential Revision: https://reviews.llvm.org/D27154

llvm-svn: 288857
2016-12-06 22:49:36 +00:00
Matt Arsenault ac066f354a AMDGPU: Fix operand name for v_interp_*
Other VOP instructions call the output vdst

llvm-svn: 288856
2016-12-06 22:29:43 +00:00
Sanjay Patel 5369775a84 [InstSimplify] fixed (?) to not mutate icmps
As Eli noted in the post-commit thread for r288833, the use of
swapOperands() may not be allowed in InstSimplify, so I'm 
removing those calls here pending further review. 

The swap mutates the icmp, and there doesn't appear to be precedent
for instruction mutation in InstSimplify.

I didn't actually have any tests for those cases, so I'm adding
a few here. 

llvm-svn: 288855
2016-12-06 22:09:52 +00:00
Eugene Zelenko 40b89ff81e [IR] Fix some Clang-tidy modernize-use-equals-delete and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 288853
2016-12-06 22:00:57 +00:00
Tom Stellard 175959e350 AMDGPU/SI: Set correct value for amd_kernel_code_t::kernarg_segment_alignment
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D27416

llvm-svn: 288852
2016-12-06 21:53:10 +00:00
Davide Italiano 043e66137c [BDCE/DebugInfo] Preserve llvm.dbg.value's argument.
BDCE has two phases:
1. It asks SimplifyDemandedBits if all the bits of an instruction are dead, and if so,
replaces all its uses with the constant zero.
2. Then, it asks SimplifyDemandedBits again if the instruction is really dead
(no side effects etc..) and if so, eliminates it.

Now, in 1) if all the bits of an instruction are dead, we may end up replacing a dbg use:
  %call = tail call i32 (...) @g() #4, !dbg !15
  tail call void @llvm.dbg.value(metadata i32 %call, i64 0, metadata !8, metadata !16), !dbg !17
->
  %call = tail call i32 (...) @g() #4, !dbg !15
  tail call void @llvm.dbg.value(metadata i32 0, i64 0, metadata !8, metadata !16), !dbg !17

but not eliminating the call because it may have arbitrary side effects.
In other words, we lose some debug informations.
This patch fixes the problem making sure that BDCE does nothing with the instruction if
it has side effects and no non-dbg uses.

Differential Revision:  https://reviews.llvm.org/D27471

llvm-svn: 288851
2016-12-06 21:52:47 +00:00
Tom Stellard 00cfa74715 AMDGPU/SI: Don't move copies of immediates to the VALU
Summary:
If we write an immediate to a VGPR and then copy the VGPR to an
SGPR, we can replace the copy with a S_MOV_B32 sgpr, imm, rather than
moving the copy to the SALU.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D27272

llvm-svn: 288849
2016-12-06 21:13:30 +00:00
Tim Northover 14ceb45fb4 GlobalISel: correctly handle small args via memory.
We were rounding size in bits down rather than up, leading to 0-sized slots for
i1 (assert!) and bugs for other types not byte-aligned.

llvm-svn: 288848
2016-12-06 21:02:19 +00:00
Zvi Rackover 8bc7e4da51 [X86] Prefer reduced width multiplication over pmulld on Silvermont
Summary:
Prefer expansions such as: pmullw,pmulhw,unpacklwd,unpackhwd over pmulld.
On Silvermont [source: Optimization Reference Manual]:
PMULLD has a throughput of 1/11 [instruction/cycles].
PMULHUW/PMULHW/PMULLW have a throughput of 1/2 [instruction/cycles].

Fixes pr31202.

Analysis of this issue was done by Fahana Aleen.

Reviewers: wmi, delena, mkuper

Subscribers: RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D27203

llvm-svn: 288844
2016-12-06 19:35:20 +00:00
Simon Pilgrim dd6ca639d5 [DAGCombine] Add (sext_in_reg (zext x)) -> (sext x) combine
Handle the case where a sign extension has ended up being split into separate stages (typically to get around vector legal ops) and a zext + sext_in_reg gets inserted.

Differential Revision: https://reviews.llvm.org/D27461

llvm-svn: 288842
2016-12-06 19:09:37 +00:00
Sanjay Patel 9b1b2de348 [InstSimplify] add folds for and-of-icmps with same operands
All of these (and a few more) are already handled by InstCombine,
but we shouldn't have to wait until then to simplify these because
they're cheap to deal with here in InstSimplify.

This is the 'and' sibling of the earlier 'or' patch:
https://reviews.llvm.org/rL288833

llvm-svn: 288841
2016-12-06 19:05:46 +00:00
Tim Northover 0a683e7bfd GlobalISel: fall back gracefully when we hit unhandled legalizer default.
llvm-svn: 288840
2016-12-06 19:02:15 +00:00
Simon Pilgrim 1577b39f51 [SelectionDAG] We can ignore knownbits from an undef shuffle vector index if we don't actually demand that element
llvm-svn: 288839
2016-12-06 18:58:25 +00:00
Sanjay Patel 827414876f [InstSimplify] add tests for and-of-icmps; NFC
llvm-svn: 288837
2016-12-06 18:46:54 +00:00
Tim Northover c1a23854f3 GlobalISel: handle G_SEQUENCE fallbacks gracefully.
There were two problems:
  + AArch64 was reusing random data from its binary op tables, which is
    complete nonsense for G_SEQUENCE.
  + Even when AArch64 gave up and said it couldn't handle G_SEQUENCE,
    the generic code asserted.

llvm-svn: 288836
2016-12-06 18:38:38 +00:00
Tim Northover f50f2f3d32 GlobalISel: allow G_SELECT instructions for pointers.
llvm-svn: 288835
2016-12-06 18:38:34 +00:00
Tim Northover 405e25cd6a GlobalISel: stop the legalizer from trying to handle oddly-sized types.
It'll almost immediately fail because it always tries to half/double the size
until it finds a legal one. Unfortunately, this triggers an assertion
preventing the DAG fallback from being possible.

llvm-svn: 288834
2016-12-06 18:38:29 +00:00
Sanjay Patel d0ccdb46b9 [InstSimplify] add folds for or-of-icmps with same operands
All of these (and a few more) are already handled by InstCombine,
but we shouldn't have to wait until then to simplify these because
they're cheap to deal with here in InstSimplify.

llvm-svn: 288833
2016-12-06 18:09:37 +00:00
George Rimar 114d335bf9 [llvm-readobj] - Teach readobj to print PT_OPENBSD_BOOTDATA header
These are OpenBSD specific program headers.

OpenBSD commit:
d39116912b

It is required for fixing PR31288.

Differential revision: https://reviews.llvm.org/D27456

llvm-svn: 288831
2016-12-06 17:55:52 +00:00
Sanjay Patel 6d4444f931 [InstSimplify] add tests for or-of-icmps; NFC
llvm-svn: 288830
2016-12-06 17:49:10 +00:00
Chris Bieneman ec758fad08 [CMake] Fixing clang standalone build
I broke this in r288770.

llvm-svn: 288829
2016-12-06 17:09:29 +00:00
Simon Pilgrim 4a2979ce12 [X86][SSE] Add knownbits test demonstrating demandedelts not ignoring undef shuffle elements
llvm-svn: 288825
2016-12-06 17:00:47 +00:00
Simon Pilgrim 0caaadfc2d [X86][SSE] Added vector sext_in_reg combine tests
llvm-svn: 288819
2016-12-06 15:57:26 +00:00
George Rimar 3840121cdd Removed trailing whitespaces. NFC.
llvm-svn: 288817
2016-12-06 15:40:02 +00:00
George Rimar 92b54b5d87 [Support/ELF] - Add OpenBSD PT_OPENBSD_BOOTDATA constant.
OpenBSD commit for reference:
d39116912b

llvm-svn: 288816
2016-12-06 15:38:15 +00:00
Simon Pilgrim 7c7b649639 [X86] Improve UMAX/UMIN knownbits test
Test the sequential effect of each op

llvm-svn: 288815
2016-12-06 15:17:50 +00:00
Simon Pilgrim 29c17f3f58 Avoid repeated calls to Op.getOpcode(). NFCI.
llvm-svn: 288814
2016-12-06 14:50:09 +00:00
Daniel Sanders 4fd1e7c628 [globalisel][aarch64] Fix unintended assumptions about PartialMappingIdx. NFC.
Summary:
This is NFC but prevents assertions when PartialMappingIdx is tablegen-erated.
The assumptions were:
1) FirstGPR is 0
2) FirstGPR is the first of the First* enumerators.

GPR32 is changed to 1 to demonstrate that assumption #1 is fixed. #2 will
be covered by a subsequent patch that tablegen-erates information and swaps
the order of GPR and FPR as a side effect.

Depends on D27336

Reviewers: ab, t.p.northover, qcolombet

Subscribers: aemerson, rengolin, vkalintiris, dberris, rovka, llvm-commits

Differential Revision: https://reviews.llvm.org/D27337

llvm-svn: 288812
2016-12-06 14:39:57 +00:00
Daniel Sanders 21765cb15e [globalisel][aarch64] Replace magic numbers with corresponding enumerators in ValMappings. NFC
Reviewers: ab, t.p.northover, qcolombet

Subscribers: aemerson, rengolin, vkalintiris, dberris, llvm-commits, rovka

Differential Revision: https://reviews.llvm.org/D27336

llvm-svn: 288810
2016-12-06 13:55:01 +00:00
Daniel Sanders 605f8cd30d [globalisel][aarch64] Correct argument names in comments.
llvm-svn: 288809
2016-12-06 13:48:58 +00:00
Simon Pilgrim e633741c3a [SLPVectorizer][X86] Tests to show missed buildvector sitofp/fptosi vectorizations
e.g.
buildvector(sitofp(i32), sitofp(i32), sitofp(i32), sitofp(i32)) --> sitofp(buildvector(i32, i32, i32, i32))

llvm-svn: 288807
2016-12-06 13:29:55 +00:00
Oliver Stannard 870b5cad45 [ARM] Better error message for invalid flag-preserving Thumb1 insts
When we see a non flag-setting instruction for which only the flag-setting
version is available in Thumb1, we should give a better error message than
"invalid instruction".

Differential Revision: https://reviews.llvm.org/D27414

llvm-svn: 288805
2016-12-06 12:59:08 +00:00
Ayman Musa 86c00b799f [X86][AVX512] Detect repeated constant patterns in BUILD_VECTOR suitable for broadcasting.
Check if a build_vector node includes a repeated constant pattern and replace it with a broadcast of that pattern.
For example:
"build_vector <0, 1, 2, 3, 0, 1, 2, 3>" would be replaced by "broadcast <0, 1, 2, 3>"

Differential Revision: https://reviews.llvm.org/D26802

llvm-svn: 288804
2016-12-06 12:24:14 +00:00
Simon Pilgrim ae63dd10f8 [X86] Add tests to show missed opportunities to calculate knownbits in SMAX/SMIN/UMAX/UMIN
llvm-svn: 288801
2016-12-06 12:12:20 +00:00
Nemanja Ivanovic 15748f4921 [PowerPC] Improvements for BUILD_VECTOR Vol. 4
This is the final patch in the series of patches that improves
BUILD_VECTOR handling on PowerPC. This adds a few peephole optimizations
to remove redundant instructions. It also adds a large test case which
encompasses a large set of code patterns that build vectors - this test
case was the motivator for this series of patches.

Differential Revision: https://reviews.llvm.org/D26066

llvm-svn: 288800
2016-12-06 11:47:14 +00:00
Daniel Sanders bfd5ff155a [globalisel][aarch64] Prefix PartialMappingIdx enumerators with 'PMI_' to fit coding standards.
This also stops things like 'None' polluting the llvm::AArch64 namespace.

llvm-svn: 288799
2016-12-06 11:33:04 +00:00
Simon Pilgrim 4c2885947e Fix MSVC -Wmicrosoft-enum-value 'enumerator value is not representable' warning
llvm-svn: 288798
2016-12-06 11:27:19 +00:00
Simon Pilgrim 9335c020c6 Fix MSVC bool to uint64_t promotion warning
llvm-svn: 288796
2016-12-06 11:12:53 +00:00
Chandler Carruth 23a6c3f746 [LCG] Add some much needed asserts and verify runs to uncover
a hilarious bug and fix it.

We somehow were never verifying the RefSCCs newly formed when
splitting an existing one apart, and when verifying them we weren't
really checking the SCC indices mapping effectively.

If we had been, it would have been blindingly obvious that right after
putting something int `RC.SCCs` we should update `RC.SCCIndices` instead
of `SCCIndices` which we were about to clear and rebuild anyways. =[

Anyways, this is thoroughly covered by existing tests now that we
actually verify things properly.

llvm-svn: 288795
2016-12-06 10:29:23 +00:00
Florian Hahn 7582c669bd [framelowering] Improve tracking of first CS pop instruction.
Summary: This patch makes sure FirstCSPop and MBBI never point to DBG_VALUE instructions, which affected the code generated.

Reviewers: mkuper, aprantl, MatzeB

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27343

llvm-svn: 288794
2016-12-06 10:24:55 +00:00
Sam McCall 03435f57aa Add missing parens in assert.
Summary: Add missing parens in assert, which warn in GCC.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27448

llvm-svn: 288792
2016-12-06 10:14:36 +00:00
Chandler Carruth 8977223e55 [PM] Basic cleanups to CGSCC update code, NFC.
Just using InstIterator, simpler loop structures, and making better use
of the visit callback infrastructure.

llvm-svn: 288790
2016-12-06 10:06:06 +00:00
Craig Topper b34eef7b41 [X86] Remove another weird scalar sqrt/rcp/rsqrt pattern.
This pattern turned a vector sqrt/rcp/rsqrt operation of sse_load_f32/f64 into the the scalar instruction for the operation and put undef into the upper bits. For correctness, the resulting code should still perform the sqrt/rcp/rsqrt on the upper bits after the load is extended since that's what the operation asked for. Particularly in the case where the upper bits are 0, in that case we need calculate the sqrt/rcp/rsqrt of the zeroes and keep the result in the upper-bits. This implies we should be using the packed instruction still.

The only test case for this pattern is one I just added so there was no coverage of this.

llvm-svn: 288784
2016-12-06 08:08:12 +00:00
Craig Topper 26ce4267ef [X86] Add test case demonstrating a case where a vector sqrt being passed (scalar_to_vector loadf64) uses a scalar sqrt instruction.
This occurs due to a pattern that uses sse_load_f32/f64 with vector sqrt/rcp/rsqrt operations and turns them into scalar instructions. Perhaps for the case were the upper bits come from undef this is ok.  I believe a (vzmovl load64) would do the same thing but those seems to become vzload instead and selectScalarSSELoad doesn't handle that today. In that case we should be performing the vector operation on the zeros in the upper bits which is not equivalent to using a scalar instruction.

I will remove this pattern in a follow up patch. There appears to be no other test content for it.

llvm-svn: 288783
2016-12-06 08:08:09 +00:00
Craig Topper aa2c38378c [X86] Regenerate a test using update_llc_test_checks.py
llvm-svn: 288782
2016-12-06 08:08:07 +00:00
Craig Topper 683470bf1b [X86] Remove bad pattern that caused 128-bit loads being used by scalar sqrt/rcp/rsqrt intrinsics to select the memory form of the corresponding instruction and violate the semantics of the intrinsic.
The intrinsics are supposed to pass the upper bits straight through to their output register. This means we need to make sure we still perform the 128-bit load to get those upper bits to pass to give to the instruction since the memory form of the instruction only reads 32 or 64 bits.

llvm-svn: 288781
2016-12-06 08:08:04 +00:00