Commit Graph

158335 Commits

Author SHA1 Message Date
Craig Topper 06dad14797 [X86] Remove type restrictions from WidenMaskArithmetic.
This can help AVX-512 code where mask types are legal allowing us to remove extends and truncates to/from mask types.

llvm-svn: 321408
2017-12-23 18:53:05 +00:00
Craig Topper e79a7a4b2e [X86] In WidenMaskArithmetic, make sure we check the input type of a truncate on N1.
Later in the code we explicitly bypass the truncate so we should be checking its type to make sure that it's safe.

llvm-svn: 321407
2017-12-23 18:53:03 +00:00
Craig Topper dbbbb8532c [X86] Remove unneeded EVT variable. NFC
Immediately after it is created we check if its equal to another EVT. Then we inconsistently use one or the other variables in the code below.

Instead do the equality check directly on the getValueType result and remove the variable. Use the origina VT variable throughout the remaining code.

llvm-svn: 321406
2017-12-23 18:53:01 +00:00
Simon Pilgrim fc01bf86d5 [X86][X87] Wrap FpI_ pseudo to use PseudoI. NFCI.
llvm-svn: 321405
2017-12-23 17:25:59 +00:00
Davide Italiano 55b663431e [SCCP] Manually fold branches on undef.
This code was originally removed and replace with an assertion
because believed unnecessary. It turns out there was simply
no test coverage for this case, and the constant folder doesn't
yet know about patterns like `br undef %label1, %label2`.
Presumably at some point the constant folder might learn about
these patterns, but it's a broader change.
A testcase will be added to make sure this doesn't regress again
in the future.

Fixes PR35723.

llvm-svn: 321402
2017-12-23 15:06:30 +00:00
Simon Pilgrim 730cbc8f8e [X86] Add default InstrItinClass to PseudoI
This will be used to help tidyup existing pseudos that we've added scheduling info to.

llvm-svn: 321401
2017-12-23 10:47:21 +00:00
Craig Topper b8e7ab8231 [X86] Pass the right VT to the getZeroExtendInReg introduced in r321398
Apparently we don't have tests for this which I didn't realize before. I'll try to fix that but wanted to fix the obvious bug.

llvm-svn: 321399
2017-12-23 06:52:03 +00:00
Craig Topper ed4a87f6a8 [X86] Use SelectionDAG::getZeroExtendInReg instead of implementing it manually.
llvm-svn: 321398
2017-12-23 02:54:52 +00:00
Craig Topper d6a8f2e67d [SelectionDAG][X86] Don't use ->getValueType(0) after a call to getOperand to get the type of the operand.
getOperand returns an SDValue that contains the node and the result number. There is no guarantee that the result number if 0. By using the -> operator we are calling SDNode::getValueType rather than SDValue::getValueType. This requires supplying a result number and we shouldn't assume it was 0.

I don't have a test case. Just noticed while cleaning up some other code and saw that it occurred in other places.

llvm-svn: 321397
2017-12-23 02:54:50 +00:00
Nirav Dave afeae77058 [DAG] Add missing case check from findbaseoffset merge from r321389.
llvm-svn: 321391
2017-12-22 22:06:56 +00:00
Nirav Dave d8e3633db4 Integrate findBaseOffset address analyses to BaseIndexOffset. NFCI.
BaseIndexOffset supercedes findBaseOffset analysis save only Constant
Pool addresses. Migrate analysis to BaseIndexOffset.

Relanding after correcting base address matching check.

llvm-svn: 321389
2017-12-22 21:20:55 +00:00
Walter Lee b074f1428f [git-llvm] Handle files ignored by svn correctly
Summary: Correctly handle files ignored by svn (such as .o files,
which are ignored by default) by adding "--no-ignore" flag to "svn
status" and "svn add".

Differential Revision: https://reviews.llvm.org/D41404

llvm-svn: 321388
2017-12-22 21:19:13 +00:00
Benjamin Kramer 171f69e84e Unbreak the build. Combining chrono with Optional is annoying.
llvm-svn: 321387
2017-12-22 21:18:50 +00:00
Sam Clegg 6006e09169 [WebAssembly] MC: Fix for address taken aliases
Previously, taking the address for an alias would result in:
 "Symbol not found in table index space"

Increase test coverage for weak aliases.

This code should be more efficient too as it avoids building
the `IsAddressTaken` set.

Differential Revision: https://reviews.llvm.org/D41510

llvm-svn: 321384
2017-12-22 20:31:39 +00:00
Alina Sbirlea ca741a87f2 [MemorySSA] Allow reordering of loads that alias in the presence of volatile loads.
Summary:
Make MemorySSA allow reordering of two loads that may alias, when one is volatile.
This makes MemorySSA less conservative and behaving the same as the AliasSetTracker.
For more context, see D16875.

LLVM language reference: "The optimizers must not change the number of volatile operations or change their order of execution relative to other volatile operations. The optimizers may change the order of volatile operations relative to non-volatile operations. This is not Java’s “volatile” and has no cross-thread synchronization behavior."

Reviewers: george.burgess.iv, dberlin

Subscribers: sanjoy, reames, hfinkel, llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D41525

llvm-svn: 321382
2017-12-22 19:54:03 +00:00
Nirav Dave 9c26e92aab Revert "[DAG] Integrate findBaseOffset address analyses to BaseIndexOffset. NFCI."
which was causing miscompilations in for some test-suite components.

This reverts commit 3e9de9ff0f3162920a2a3cba51c7dc14b54b4d16.

llvm-svn: 321380
2017-12-22 19:33:56 +00:00
Guozhi Wei 33250340f4 [SimplifyCFG] Don't do if-conversion if there is a long dependence chain
If after if-conversion, most of the instructions in this new BB construct a long and slow dependence chain, it may be slower than cmp/branch, even if the branch has a high miss rate, because the control dependence is transformed into data dependence, and control dependence can be speculated, and thus, the second part can execute in parallel with the first part on modern OOO processor.

This patch checks for the long dependence chain, and give up if-conversion if find one.

Differential Revision: https://reviews.llvm.org/D39352

llvm-svn: 321377
2017-12-22 18:54:04 +00:00
Ben Dunbobbin bb534b15a9 [ThinLTO][CachePruning] explicitly disable pruning
In https://reviews.llvm.org/rL321077 and https://reviews.llvm.org/D41231 I fixed a regression in the c-api which prevented the pruning from being *effectively* disabled.

However this approach, helpfully recommended by @labath, is cleaner.
It is also nice to remove the weasel words about effectively disabling from the api comments.

Differential Revision: https://reviews.llvm.org/D41497

llvm-svn: 321376
2017-12-22 18:32:15 +00:00
Sanjoy Das 26d11ca4b0 (Re-landing) Expose a TargetMachine::getTargetTransformInfo function
Re-land r321234.  It had to be reverted because it broke the shared
library build.  The shared library build broke because there was a
missing LLVMBuild dependency from lib/Passes (which calls
TargetMachine::getTargetIRAnalysis) to lib/Target.  As far as I can
tell, this problem was always there but was somehow masked
before (perhaps because TargetMachine::getTargetIRAnalysis was a
virtual function).

Original commit message:

This makes the TargetMachine interface a bit simpler.  We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.

See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html

I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.

Reviewers: echristo, MatzeB, hfinkel

Reviewed By: hfinkel

Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D41464

llvm-svn: 321375
2017-12-22 18:21:59 +00:00
Dmitry Preobrazhensky 471adf7fdc [AMDGPU][MC] Corrected handling of negative expressions
See bug 35716: https://bugs.llvm.org/show_bug.cgi?id=35716

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D41488

llvm-svn: 321372
2017-12-22 18:03:35 +00:00
Craig Topper b2368fbdf4 [SelectionDAG] Reverse the order of operands in the ISD::ADD created by TargetLowering::getVectorElementPointer so that the FrameIndex is on the left.
This seems to improve X86's ability to match this into an address computation. Otherwise the other operand gets assigned to the base register and the stack pointer + frame index ends up in the index register. But index registers can't encode ESP/RSP so we end up having to move it into another register to meet the constraint.

I could try to improve the address matcher in X86, but swapping the producer seemed easier. Several other places already have the operands in this order so this is at least consistent.

llvm-svn: 321370
2017-12-22 17:18:13 +00:00
Craig Topper 576335f998 [X86] When lowering insert_vector_elt/extract_vector_elt of vXi1 with a non-constant index just use either a 128-bit type or the vXi8 type with the correct number of elements.
Despite what the comment said there isn't better codegen for 512-bit vectors. The 128/256/512 bit implementation jus stores to memory and loads an element. There's no advantage to doing that with a larger size. In fact in many cases it causes a stack realignment and generates worse code.

llvm-svn: 321369
2017-12-22 17:18:11 +00:00
Craig Topper eff84ed204 [X86] Improve the printing of address mode during isel matching.
Fix some inconsistent new line behavior and only print the FrameIndex when the address mode is a FrameIndexBase addressing mode.

llvm-svn: 321368
2017-12-22 17:18:10 +00:00
Dmitry Preobrazhensky c5b0c172f6 [AMDGPU][MC] Corrected parsing of optional operands for ds_swizzle_b32
See bug 35645: https://bugs.llvm.org/show_bug.cgi?id=35645

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D41186

llvm-svn: 321367
2017-12-22 17:13:28 +00:00
Haicheng Wu 6d14dfe8f3 [InlineCost] Find more free binary operations
Currently, inline cost model considers a binary operator as free only if both
its operands are constants. Some simple cases are missing such as a + 0, a - a,
etc. This patch modifies visitBinaryOperator() to call SimplifyBinOp() without
going through simplifyInstruction() to get rid of the constant restriction.
Thus, visitAnd() and visitOr() are not needed.

Differential Revision: https://reviews.llvm.org/D41494

llvm-svn: 321366
2017-12-22 17:09:09 +00:00
Nirav Dave 4ce902657a [DAG] Integrate findBaseOffset address analyses to BaseIndexOffset. NFCI.
BaseIndexOffset supercedes findBaseOffset analysis save only Constant
Pool addresses. Migrate analysis to BaseIndexOffset.

llvm-svn: 321364
2017-12-22 16:59:09 +00:00
Dmitry Preobrazhensky 2713495318 [AMDGPU][MC] Added support of 256- and 512-bit tuples of ttmp registers
See bug 35561: https://bugs.llvm.org/show_bug.cgi?id=35561

This patch also affects implementation of SGPR and VGPR registers though changes are cosmetic.

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D41437

llvm-svn: 321359
2017-12-22 15:18:06 +00:00
Simon Atanasyan 5cd90ccbe3 [mips] Add test case to check that calls to mcount follow long calls / short calls options. NFC
llvm-svn: 321357
2017-12-22 13:45:46 +00:00
Diana Picus 28a6d0e639 [ARM GlobalISel] Support G_INTTOPTR and G_PTRTOINT for s32
Mark conversions between pointers and 32-bit scalars as legal, map them
to the GPR and select to a simple COPY.

llvm-svn: 321356
2017-12-22 13:05:51 +00:00
Diana Picus 68773859c8 [ARM GlobalISel] Support pointer constants
Pointer constants are pretty rare, since we usually represent them as
integer constants and then cast to pointer. One notable exception is the
null pointer constant, which is represented directly as a G_CONSTANT 0
with pointer type. Mark it as legal and make sure it is selected like
any other integer constant.

llvm-svn: 321354
2017-12-22 11:09:18 +00:00
Sam Parker cf426fccd4 [DAGCombine] Revert r321259
Improve ReduceLoadWidth for SRL Patch is causing an issue on the
PPC64 BE santizer.

llvm-svn: 321349
2017-12-22 08:36:25 +00:00
Chandler Carruth 54a5ad3681 Rewrite the cached map used for locating the most precise DIE among
inlined subroutines for a given address.

This is essentially the hot path of llvm-symbolizer when extracting
inlined frames during symbolization. Previously, we would read every
subprogram and every inlined subroutine, building a std::map across the
entire PC space to the best DIE, and then do only a handful of queries
as we symbolized a backtrace. A huge fraction of the time was spent
building the map itself.

This patch changes it two a two-level system. First, we just build a map
from PC-interval to DWARF subprograms. These are required to be disjoint
and so constructing this is pretty easy. Second, we build a map *just*
for the inlined subroutines within the subprogram containing the query
address. This allows us to look at far fewer DIEs and build a *much*
smaller set of cached maps in the llvm-symbolizer case where only a few
address get symbolized during the entire run.

It also builds both interval maps in a very different way. It constructs
a single flat vector of pairs that maps from offset -> index. The
indices point into collections of DIE objects, but can also be
"tombstones" (-1) to mark gaps. In the case of subprograms, this mostly
just simplifies the data structure a bit. For inlined subroutines,
because we carefully split them as we build the map, we end up in many
cases having no holes and not having to store both start and stop
offsets.

Finally, the PC ranges for the inlined subroutines are compressed into
32-bits by making them relative to the base PC of the outer subprogram.
This means that if you have a single function body with over 2gb of
executable code in it, we will stop mapping address past the first 2gb
of that function into inlined subroutines and just give you the
subprogram. This doesn't seem like a problem. ;]

All of this combines to make llvm-symbolizer *well* over 2x faster for
symbolizing backtraces out of LLVM's unittests. Death-test heavy unit
tests are running >2x faster. I'm still going to look at completely
disabling symbolization there, but figured while I had a good benchmark
we should make symbolization a bit better.

Sadly, the logic to build the flat interval map for the inlined
subroutines is fairly complex. I'm not super happy about this and
welcome any simplifying suggestions.

Huge thanks to Dave Blaikie who helped walk me through what the various
things I needed to do in DWARF to make this work.

Differential Revision: https://reviews.llvm.org/D40987

llvm-svn: 321345
2017-12-22 06:41:23 +00:00
Craig Topper e2873a17d7 [X86] Add missing initialization for the HasPREFETCHWT1 subtarget variable.
llvm-svn: 321340
2017-12-22 03:53:14 +00:00
Craig Topper 67885f5d58 [X86] Enable PRFCHW feature on KNL/KNM and all CPUs inherited from Broadwell.
llvm-svn: 321336
2017-12-22 02:41:12 +00:00
Craig Topper e268598dd3 [X86] Add prefetchwt1 instruction and overhaul priorities and isel enabling for prefetch instructions.
Previously prefetch was only considered legal if sse was enabled, but it should be supported with 3dnow as well.

The prfchw flag now imply at least some form of prefetch without the write hint is available, either the sse or 3dnow version. This is true even if 3dnow and sse are explicitly disabled.

Similarly prefetchwt1 feature implies availability of prefetchw and the the prefetcht0/1/2/nta instructions. This way we can support _MM_HINT_ET0 using prefetchw and _MM_HINT_ET1 with prefetchwt1. And its assumed that if we have levels for the write hint we would have levels for the non-write hint, thus why we enable the sse prefetch instructions.

I believe this behavior is consistent with gcc. I've updated the prefetch.ll to test all of these combinations.

llvm-svn: 321335
2017-12-22 02:30:30 +00:00
Craig Topper 9befe89367 [X86] Use SIGN_EXTEND to implement ANY_EXTEND from vXi1.
llvm-svn: 321334
2017-12-22 02:30:26 +00:00
Eli Friedman 07d94d59a4 inline-fp.ll was moved in r321332; delete it properly.
llvm-svn: 321333
2017-12-22 02:10:40 +00:00
Eli Friedman 39ed9a602b [Inliner] Restrict soft-float inlining penalty.
The penalty is currently getting applied in a bunch of places where it
doesn't make sense, like bitcasts (which are free) and calls (which
were getting the call penalty applied twice). Instead, just apply the
penalty to binary operators and floating-point casts.

While I'm here, also fix getFPOpCost() to do the right thing in more
cases, so we don't have to dig into function attributes.

Differential Revision: https://reviews.llvm.org/D41522

llvm-svn: 321332
2017-12-22 02:08:08 +00:00
Easwaran Raman a17f220590 Add hasProfileData() to check if a function has profile data. NFC.
Summary:
This replaces calls to getEntryCount().hasValue() with hasProfileData
that does the same thing. This refactoring is useful to do before adding
synthetic function entry counts but also a useful cleanup IMO even
otherwise. I have used hasProfileData instead of hasRealProfileData as
David had earlier suggested since I think profile implies "real" and I
use the phrase "synthetic entry count" and not "synthetic profile count"
but I am fine calling it hasRealProfileData if you prefer.

Reviewers: davidxl, silvas

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41461

llvm-svn: 321331
2017-12-22 01:33:52 +00:00
Wolfgang Pieb b4ba1aa486 [DWARF] Fix formatting bug with r321295. This fixes a MIPS buildbot failure.
llvm-svn: 321330
2017-12-22 01:12:24 +00:00
Craig Topper 8772228963 [X86] Use SIGN_EXTEND rather than ZERO_EXTEND for lowering extract_vector_elt from vXi1 with a non-const index.
We have a better range of instructions we can use if we can fill with the value i1 value rather than zeroing.

llvm-svn: 321315
2017-12-21 22:08:23 +00:00
Alina Sbirlea 50db8a2086 [ModRefInfo] Add must alias info to ModRefInfo.
Summary:
Add an additional bit to ModRefInfo, ModRefInfo::Must, to be cleared for known must aliases.
Shift existing Mod/Ref/ModRef values to include an additional most
significant bit. Update wrappers that modify ModRefInfo values to
reflect the change.

Notes:
* ModRefInfo::Must is almost entirely cleared in the AAResults methods, the remaining changes are trying to preserve it.
* Only some small changes to make custom AA passes set ModRefInfo::Must (BasicAA).
* GlobalsModRef already declares a bit, who's meaning overlaps with the most significant bit in ModRefInfo (MayReadAnyGlobal). No changes to shift the value of MayReadAnyGlobal (see AlignedMap). FunctionInfo.getModRef() ajusts most significant bit so correctness is preserved, but the Must info is lost.
* There are cases where the ModRefInfo::Must is not set, e.g. 2 calls that only read will return ModRefInfo::NoModRef, though they may read from exactly the same location.

Reviewers: dberlin, hfinkel, george.burgess.iv

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D38862

llvm-svn: 321309
2017-12-21 21:41:53 +00:00
Craig Topper 742ac98d01 [X86] When lowering truncates to vXi1, don't sign extend i16/i8 types to 512-bit if we have VLX.
This should only affect what we do for v8i16. Previously we went to v8i64, but if we have VLX we only need v8i32. This prevents an unnecessary zmm usage.

llvm-svn: 321303
2017-12-21 20:45:13 +00:00
Wolfgang Pieb 6ecd6a8088 [DWARF v5] Rework of string offsets table reader
Reorganizes the DWARF consumer to derive the string offsets table 
contribution's format from the contribution header instead of 
(incorrectly) from the unit's format.

Reviewers: JDevliegehere, aprantl

Differential Revision: https://reviews.llvm.org/D41146
 

llvm-svn: 321295
2017-12-21 19:38:13 +00:00
Craig Topper 410a289b79 [X86] Promote v8i1 shuffles to v8i32 instead of v8i64 if we have VLX.
We should have equally good shuffle options for v8i32 with VLX. This was spotted during my attempts to remove 512-bit vectors from SKX.

We still use 512-bits for v16i1, v32i1, and v64i1. I'm less sure we can handle those well with narrower vectors. i32 and i64 element sizes get the best shuffle support.

llvm-svn: 321291
2017-12-21 18:44:06 +00:00
Simon Pilgrim 4de5bb093c [X86][SSE] Split large PAVGB/PAVGW vectors to legal widths
Patch to allow detectAVGPattern handle vectors larger than the legal size (128 SSE2, 256 AVX2, 512 AVX512BW), splitting the vectors accordingly.

Differential Revision: https://reviews.llvm.org/D41440

llvm-svn: 321288
2017-12-21 18:12:31 +00:00
Francis Visoiu Mistrih c9a8451425 [YAML] Refactor escaping unittests
llvm-svn: 321284
2017-12-21 17:14:13 +00:00
Francis Visoiu Mistrih b2b961a3db [YAML] Fix UTF-8 handling
Previous YAML quoting patches broke UTF-8 printing in YAML: see https://reviews.llvm.org/D41290#961801.

Differential Revision: https://reviews.llvm.org/D41490

llvm-svn: 321283
2017-12-21 17:14:09 +00:00
Krzysztof Parzyszek 2e0f7bd0fe [TableGen] Print more helpful information in case of type contradiction
Dump the failing TreePattern.

llvm-svn: 321282
2017-12-21 17:12:43 +00:00
Simon Pilgrim 695c7da3d5 [DAGCombiner] Remove (xor (xor x, c1), c2) -> (xor x, (xor c1, c2)) fold. NFCI.
More general cases are already handled by constant canonicalization and then the ReassociateOps call at line 5327

llvm-svn: 321280
2017-12-21 16:54:03 +00:00