Commit Graph

160750 Commits

Author SHA1 Message Date
Jonas Devlieghere 7779203d80 [dsymutil][test] Add PowerPC test
Add test that verifies that we don't follow DWARF values with a
reference form class, such as DW_AT_sibling.

Since clang doesn't generate the latter attribute, we added a PowerPC
test generated on an old PowerBook G4. (Thanks Adrian!)

llvm-svn: 326183
2018-02-27 10:28:43 +00:00
Jonas Devlieghere 425b248128 [ADT] Recognize ppc as valid architecture in target triple.
Until this patch, only `powerpc` and `ppc32` were recognized as valid
PowerPC 32-bit architectures in a target triple. This was incompatible
with the triple `ppc-apple-darwin` as returned for libObject. I found
out about this when working on a test case using a binary generated on
an old PowerBook G4.

We had the choice of either fix this in the Mach-O object parser or
in the Triple implementation. I chose the latter because it feels like
the most canonical place.

Differential revision: https://reviews.llvm.org/D43760

llvm-svn: 326182
2018-02-27 10:09:58 +00:00
Florian Hahn 1807c516c7 [NewGVN] Update phi-of-ops def block when updating existing ValuePHI.
In case we update a ValuePHI node created earlier, we could update it
based on a different OpPHI which could be in a different block.
We need to update the TempToBlock mapping reflecting the new block,
otherwise we would end up placing the new phi node in a wrong block.

This problem is exposed by the test case in
https://bugs.llvm.org/show_bug.cgi?id=36504.

This patch fixes a slightly simpler problem than in the bug report. In
the bug's re-producer, the additional problem is that we are re-using a
ValuePHI node with to few incoming values for the new OpPHI. If this
patch makes sense, I will follow it up with a patch that creates a new
PHI node if the existing PHI node has a different number of incoming
values.

Reviewers: davide, dberlin

Reviewed By: dberlin

Differential Revision: https://reviews.llvm.org/D43770

llvm-svn: 326181
2018-02-27 09:34:51 +00:00
Jonas Paulsson f268cd0aad [SystemZ] Make sure SelectCode() is not called on a target opcode.
Since getNode() might not always return the requsted opcode, for instance if
called with (ISD::AND, -1) arguments, there should be a check so that
SelectCode() is only called when appropriate.

Review: Ulrich Weigand
llvm-svn: 326178
2018-02-27 07:53:23 +00:00
George Burgess IV debfbcf86e [MemorySSA] Invalidate def caches on deletion
The only cases I can come up with where this invalidation needs to
happen is when there's a deletion somewhere. If we find more creative
test-cases, we can probably go with another approach mentioned on
PR36529.

Fixes PR36529.

llvm-svn: 326177
2018-02-27 07:20:49 +00:00
George Burgess IV 612cf21ec7 [MemorySSA] Call the correct dtors
It appears that there were many cases where we were directly (through
templates) calling the dtor of MemoryAccess, which is conceptually an
abstract class.

This hasn't been a problem, since the data members of all of the
subclasses of MemoryAccess have been POD. I'm planning on changing that.
:)

llvm-svn: 326175
2018-02-27 06:43:19 +00:00
Serguei Katkov 1137cde9fe [SCEV] Cleanup SCEVInitRewriter. NFC.
Set default value for IgnoreOtherLoops of SCEVInitRewriter::rewrite to true
to be consistent with SCEVPostIncRewriter which does not have this parameter
but behaves as it would be true.

This is follow up for rL326067.

llvm-svn: 326174
2018-02-27 06:39:31 +00:00
Craig Topper 264707bae4 [X86] Simplify if condition. NFC
SSE2 implies SSE1 and we already covered f32 in the SSE1 check so we don't need to check f32 in the SSE2 check.

llvm-svn: 326170
2018-02-27 06:00:38 +00:00
Adam Nemet b424cd5d61 Make test agnostic to cost model
This was causing bot failures on greendragon

llvm-svn: 326169
2018-02-27 05:41:16 +00:00
Craig Topper fcaa0323ec [X86] Replace an impossible if condition with an assert.
llvm-svn: 326167
2018-02-27 03:50:00 +00:00
Evgeny Stupachenko f1c058d99b Fix r326154 buildbots test fail
Summary:

Add specific mtriples to tests added in r326154.

From: Evgeny Stupachenko <evstupac@gmail.com>
                         <evgeny.v.stupachenko@intel.com>
llvm-svn: 326158
2018-02-27 01:33:11 +00:00
Evgeny Stupachenko a732611ac8 Fix PR36032, PR35432
Summary:

The change fix an assert fail at ScalarEvolutionExpander.cpp:
  assert(ExitCount != SE.getCouldNotCompute() && "Invalid loop count");

Reviewers: sbaranga

Differential Revision: http://reviews.llvm.org/D42604

From: Evgeny Stupachenko <evstupac@gmail.com>
                         <evgeny.v.stupachenko@intel.com>
llvm-svn: 326154
2018-02-27 00:17:31 +00:00
Craig Topper 6df870ca58 [SelectionDAG] Remove code from PromoteIntRes_CONCAT_VECTORS that was added in r320674 to help X86.
AVX512 used to promote v32i1 to v32i8 during legalization when BWI was disabled. So this code was added to improve legalization of v32i1 concat_vectors of v16i1 by extending the v16i1 to v16i8 to avoid scalarization.

X86 has since switched to legalizing v32i1 by splitting to v16i1 instead. This has rendered this code unnecessary and its no longer exercised.

llvm-svn: 326153
2018-02-27 00:07:24 +00:00
Sanjay Patel 66911b16e6 [InstCombine, InstSimplify] add tests with undef elements in constant FP vectors; NFC
llvm-svn: 326148
2018-02-26 23:23:02 +00:00
Evandro Menezes 717941e132 [AArch64] Harden test cases
NFC

llvm-svn: 326147
2018-02-26 23:19:25 +00:00
Aditya Nandakumar 599990530e [GISel]: Don't assert when constraining RegisterOperands which are uses.
Currently we assert that only non target specific opcodes can have
missing RegisterClass constraints in the MCDesc. The backend can have
instructions with register operands but don't have RegisterClass
constraints (say using unknown_class) in which case the instruction
defining the register will constrain it.
Change the assert to only fire if a def has no regclass.

https://reviews.llvm.org/D43409

llvm-svn: 326142
2018-02-26 22:56:21 +00:00
Craig Topper 69c8972fd1 [ValueTracking] Teach cannotBeOrderedLessThanZeroImpl to handle vector constants.
Summary: This allows vector fabs to be removed in more cases.

Reviewers: spatel, arsenm, RKSimon

Reviewed By: spatel

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D43739

llvm-svn: 326138
2018-02-26 22:33:17 +00:00
Simon Pilgrim 9929f90740 [X86][SSE] Reduce FADD/FSUB/FMUL costs on later targets (PR36280)
Agner's tables indicate that for SSE42+ targets (Core2 and later) we can reduce the FADD/FSUB/FMUL costs down to 1, which should fix the Himeno benchmark.

Note: the AVX512 FDIV costs look rather dodgy, but this isn't part of this patch.

Differential Revision: https://reviews.llvm.org/D43733

llvm-svn: 326133
2018-02-26 22:10:17 +00:00
Scott Linder a04793eb93 [DebugInfo] Remove target-specific instructions in test
This AsmParser test is target-agnostic, but contained some target-specific
instructions, which broke on SystemZ.

llvm-svn: 326129
2018-02-26 21:21:19 +00:00
Craig Topper e5d39e42b9 [X86] Add constant folding to combineMOVMSK.
There's still some shortcoming in our ability to combine binops of constants with different sizes separated by an extend. I'll try to look at that next.

llvm-svn: 326128
2018-02-26 21:17:33 +00:00
Adam Nemet 713eb05c8c [opt-viewer] Kill parser processes before moving onto rendering
The main benefit is that they release the memory they were holding onto.

llvm-svn: 326127
2018-02-26 21:15:51 +00:00
Adam Nemet 9dea9b4918 opt-diff: Support splitting to multiple output files
When reading the resulting files back with opt-viewer, they will be parsed in
parallel.

llvm-svn: 326126
2018-02-26 21:15:51 +00:00
Adam Nemet f7778892d2 [opt-viewer] Set title for the source pages
llvm-svn: 326125
2018-02-26 21:15:50 +00:00
Adam Nemet cb651c05d6 opt-viewer: also find thinlto opt.yaml files
llvm-svn: 326124
2018-02-26 21:15:49 +00:00
Adam Nemet 6fd19ca763 opt-viewer: output index first
One can start looking at the index while the pages are still generating

llvm-svn: 326123
2018-02-26 21:15:47 +00:00
Craig Topper 5e0ceb8865 [X86] Add a custom legalization for (i16 (bitcast v16i1)) and (i32 (bitcast v32i1)) without AVX512 to prevent scalarization
Summary:
We have an early DAG combine to turn these patterns into MOVMSK, but that combine doesn't work if the vXi1 type has more elements than the widest legal vXi8 type. Type legalization will eventually split it down to v16i1 or v32i1 and then the bitcast gets legalized to a truncstore and a scalar load. The truncstore will get lowered to a series of extracts and bit math.

This patch adds a custom legalization to use a sign extend and MOVMSK instead. This prevents the eventual scalarization.

Reviewers: spatel, RKSimon, zvi

Reviewed By: RKSimon

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D43593

llvm-svn: 326119
2018-02-26 20:32:27 +00:00
Alexey Bataev b44e2b75e8 [SLP] Added new test + fixed some checks, NFC.
llvm-svn: 326117
2018-02-26 20:01:24 +00:00
Craig Topper 43fb1cdef7 [InstCombine] Add test cases with vector constants to fpextend.ll
llvm-svn: 326115
2018-02-26 19:36:37 +00:00
Craig Topper b284e8b9b4 [InstCombine] Switch to using FileCheck instead of grep. Auto-generate checks. NFC
llvm-svn: 326114
2018-02-26 19:36:36 +00:00
David Zarzycki d15f31936a [ADT] Simplify and optimize StringSwitch
This change improves incremental rebuild performance on dual Xeon 8168
machines by 54%. This change also improves run time code gen by not
forcing the case values to be lvalues.

llvm-svn: 326109
2018-02-26 18:41:26 +00:00
Adam Nemet b4ce3573c4 [LTO] Support filtering by hotness threshold
This wires up -pass-remarks-hotness-threshold to LTO and ThinLTO.

Next is to change the clang driver to pass this
with -fdiagnostics-hotness-threshold.

Differential Revision: https://reviews.llvm.org/D41465

llvm-svn: 326107
2018-02-26 18:37:45 +00:00
Simon Pilgrim db0ed7d724 [X86][AVX] createPSADBW - support 256-bit cases on AVX1 via SplitBinaryOpsAndApply
llvm-svn: 326104
2018-02-26 18:17:25 +00:00
Matt Arsenault 2a26a286db AMDGPU/GlobalISel: Make f64 constants legal
llvm-svn: 326101
2018-02-26 17:20:43 +00:00
Sanjay Patel 31a90468e1 [InstCombine] allow fdiv folds with less than fully 'fast' ops
Note: gcc appears to allow this fold with -freciprocal-math alone, 
but clang/llvm require more than that with this patch. The wording
in the definitions seems fuzzy enough that it could go either way,
but we'll err on the conservative side of FMF interpretation.

This patch also changes the newly created fmul to have FMF propagated
by the last fdiv rather than intersecting the FMF of the fdivs. This
matches the behavior of other folds near here. The new fmul is only 
used to produce an intermediate op for the final fdiv result, so it
shouldn't be any stricter than that result. The previous behavior
could result in dropping FMF via other folds in instcombine or CSE.

Differential Revision: https://reviews.llvm.org/D43398

llvm-svn: 326098
2018-02-26 16:02:45 +00:00
Simon Pilgrim 2f0aab9209 [X86][AVX] Add AVX1 PSAD tests
Cleanup check-prefixes to share more AVX/AVX512 codegen checks

llvm-svn: 326097
2018-02-26 15:55:25 +00:00
Ilya Biryukov d9d9bf8d13 Revert r326092: [gtest] Add PrintTo overload for StringRef.
It seems to break the following buildbot:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/24729

Will resubmit after investigating and fixing it.

llvm-svn: 326096
2018-02-26 15:54:59 +00:00
Francis Visoiu Mistrih e4fae4d5b6 [CodeGen] Don't omit any redundant information in -debug output
In r322867, we introduced IsStandalone when printing MIR in -debug
output. The default behaviour for that was:

1) If any of MBB, MI, or MO are -debug-printed separately, don't omit any
redundant information.

2) When -debug-printing a MF entirely, don't print any redundant
information.

3) When printing MIR, don't print any redundant information.

I'd like to change 2) to:

2) When -debug-printing a MF entirely, don't omit any redundant information.

Differential Revision: https://reviews.llvm.org/D43337

llvm-svn: 326094
2018-02-26 15:23:42 +00:00
Simon Pilgrim 98fcd2eb27 [X86][SSE] Regenerate PSAD tests
Fixes scary typo in a check that lost the end digit off a reg#...

llvm-svn: 326093
2018-02-26 15:21:58 +00:00
Ilya Biryukov ab6554fefb [gtest] Add PrintTo overload for StringRef.
Summary:
It was printed using code for generic containers before, resulting in
unreadable output.

Reviewers: sammccall, labath

Reviewed By: sammccall, labath

Subscribers: labath, zturner, llvm-commits

Differential Revision: https://reviews.llvm.org/D43330

llvm-svn: 326092
2018-02-26 15:19:26 +00:00
Jonas Devlieghere 560ce2c70f Re-land: "[Support] Replace HashString with djbHash."
This patch removes the HashString function from StringExtraces and
replaces its uses with calls to djbHash from DJB.h.

This change is *almost* NFC. While the algorithm is identical, the
djbHash implementation in StringExtras used 0 as its default seed while
the implementation in DJB uses 5381. The latter has been shown to result
in less collisions and improved avalanching and is used by the DWARF
accelerator tables.

Because some test were implicitly relying on the hash order, I've
reverted to using zero as a seed for the following two files:

  lld/include/lld/Core/SymbolTable.h
  llvm/lib/Support/StringMap.cpp

Differential revision: https://reviews.llvm.org/D43615

llvm-svn: 326091
2018-02-26 15:16:42 +00:00
Tim Renouf 832f90fa0c [AMDGPU] Scratch setup fix on AMDPAL gfx9+ merge shader
Summary:
With OS type AMDPAL, the scratch descriptor is hardwired to be loaded
from offset 0 of the global information table, whose low pointer is
passed in s0. For a merge shader on gfx9+, it needs to be s8 instead, as
the hardware reserves s0-s7.

Reviewers: kzhuravl

Subscribers: arsenm, nhaehnle, dstuttard, llvm-commits, t-tye, yaxunl, wdng, kzhuravl

Differential Revision: https://reviews.llvm.org/D42203

llvm-svn: 326088
2018-02-26 14:46:43 +00:00
Tim Renouf f40707a2db [LiveIntervals] Handle moving up dead partial write
Summary:
In the test case, the machine scheduler moves a dead write to a subreg
up into the middle of a segment of the overall reg's live range, where
the segment had liveness only for other subregs in the reg.
handleMoveUp created an invalid live range, causing an assert a bit
later.

This commit fixes it to handle that situation. The segment is split in
two at the insertion point, and the part after the split, and any
subsequent segments up to the old position, are changed to be defined by
the moved def.

V2: Better test.

Subscribers: MatzeB, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D43478

Change-Id: Ibc42445ddca84e79ad1f616401015d22bc63832e
llvm-svn: 326087
2018-02-26 14:42:13 +00:00
David Zarzycki f3fa88b288 Test commit
llvm-svn: 326085
2018-02-26 13:05:18 +00:00
Jonas Devlieghere 370bf3ef49 Revert "[Support] Replace HashString with djbHash."
It looks like some of our tests depend on the ordering of hashed values.
I'm reverting my changes while I try to reproduce and fix this locally.

Failing builds:

  lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/18388
  lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/6743
  lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/15607

llvm-svn: 326082
2018-02-26 12:05:18 +00:00
Jonas Devlieghere b9ad175935 [Support] Replace HashString with djbHash.
This removes the HashString function from StringExtraces and replaces
its uses with calls to djbHash from DJB.h

This is *almost* NFC. While the algorithm is identical, the djbHash
implementation in StringExtras used 0 as its seed while the
implementation in DJB uses 5381. The latter has been shown to result in
less collisions and improved avalanching.

https://reviews.llvm.org/D43615
(cherry picked from commit 77f7f965bc9499a9ae768a296ca5a1f7347d1d2c)

llvm-svn: 326081
2018-02-26 11:30:13 +00:00
Benjamin Kramer b84e158df7 [WebAssembly] Relax constexpr for old standard libraries.
This will still be constexpr when the standard library supports it, but
doesn't force constexpr. Old libraries will get a global constructor,
which is not too bad.

llvm-svn: 326080
2018-02-26 11:07:25 +00:00
Renato Golin 9d1b2acaaa [LV] Move isLegalMasked* functions from Legality to CostModel
All SIMD architectures can emulate masked load/store/gather/scatter
through element-wise condition check, scalar load/store, and
insert/extract. Therefore, bailing out of vectorization as legality
failure, when they return false, is incorrect. We should proceed to cost
model and determine profitability.

This patch is to address the vectorizer's architectural limitation
described above. As such, I tried to keep the cost model and
vectorize/don't-vectorize behavior nearly unchanged. Cost model tuning
should be done separately.

Please see
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120164.html for
RFC and the discussions.

Closes D43208.

Patch by: Hideki Saito <hideki.saito@intel.com>

llvm-svn: 326079
2018-02-26 11:06:36 +00:00
Florian Hahn ed45836253 [LoopInterchange] Add test case for D43236.
llvm-svn: 326078
2018-02-26 10:46:25 +00:00
Florian Hahn a1822cbabc [LoopInterchange] Loops with empty dependency matrix are safe.
The dependency matrix is only empty if no conflicting load/store
instructions have been found. In that case, it is safe to interchange.

For the LLVM test-suite, after this change around 1900 loops are
interchanged, whereas it is 15 before this change. On cortex-a57,
this gives an improvement of -0.57% on the geomean execution
time of SPEC2006, SPEC2000 and the test-suite. There are a
few small perf regressions, but I think we can improve on those
by making the cost model better.

Reviewers: karthikthecool, mcrosier

Reviewed by: karthikthecool

Differential Revision: https://reviews.llvm.org/D43236

llvm-svn: 326077
2018-02-26 10:45:25 +00:00
Andrew V. Tischenko 083891925b The final step to close D41278 [MachineCombiner] Improve debug output (NFC).
Differential Revision: https://reviews.llvm.org/D41278

llvm-svn: 326074
2018-02-26 09:43:21 +00:00
Serguei Katkov c2f74638ac [SCEV] Factor out getUsedLoops
The patch introduces the new function in ScalarEvolution to get
all loops used in specified SCEV.

This is a preparation for re-writing isKnownPredicate utility as
described in https://reviews.llvm.org/D42417.

Reviewers: sanjoy, mkazantsev, reames
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43504

llvm-svn: 326072
2018-02-26 09:26:41 +00:00
Serguei Katkov a95d2aee7d [SCEV] Introduce SCEVPostIncRewriter
The patch introduces the SCEVPostIncRewriter rewriter which
is similar to SCEVInitRewriter but rewrites AddRec with post increment
value of this AddRec.

This is a preparation for re-writing isKnownPredicate utility as
described in https://reviews.llvm.org/D42417.

Reviewers: sanjoy, mkazantsev, reames
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43499

llvm-svn: 326071
2018-02-26 08:40:18 +00:00
Jonas Paulsson b1e81479e9 [XCore] Return true in enableMultipleCopyHints().
Enable multiple COPY hints to eliminate more COPYs during register allocation.

Note that this is something all targets should do, see
https://reviews.llvm.org/D38128.

Review: Robert Lytton
llvm-svn: 326069
2018-02-26 08:03:32 +00:00
Craig Topper f340a0e8c0 [X86] Add avx1 command line to madd.ll to show splitting and concatenating 256-bit operations.
llvm-svn: 326068
2018-02-26 07:48:17 +00:00
Serguei Katkov 339c2e8287 [SCEV] Extends the SCEVInitRewriter
The patch introduces an additional parameter IgnoreOtherLoops to SCEVInitRewriter.
if it is equal to true then rewriter will not invalidate result in case
SCEV depends on other loops then specified during creation.

The patch does not change the default behavior.
This is a preparation for re-writing isKnownPredicate utility as
described in https://reviews.llvm.org/D42417.

Reviewers: sanjoy, mkazantsev, reames
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43498

llvm-svn: 326067
2018-02-26 07:08:56 +00:00
Craig Topper 5c980eba47 [X86] Don't use getZExtValue when we have no idea how large the input elements are.
llvm-svn: 326066
2018-02-26 04:43:24 +00:00
Craig Topper 2286058f46 [X86] Use SelectionDAG::SplitVectorOperand to simplify some code. NFC
llvm-svn: 326065
2018-02-26 02:16:34 +00:00
Craig Topper 2bf8e3e0e1 [X86] Simplify the ReplaceNodeResults code for X86ISD::AVG.
This code seemed to try to widen to 128, 256, or 512 bit vectors, but we only create X86ISD::AVG with a power of 2 number of elements. This means the only nodes that need to be legalized are less than 128-bits and need to be widened up to 128 bits.

llvm-svn: 326064
2018-02-26 02:16:33 +00:00
Craig Topper 79d189f597 [X86] Remove VT.isSimple() check from detectAVGPattern.
Which types are considered 'simple' is a function of the requirements of all targets that LLVM supports. That shouldn't directly affect what types we are able to handle. The remainder of this code checks that the number of elements is a power of 2 and takes care of splitting down to a legal size.

llvm-svn: 326063
2018-02-26 02:16:31 +00:00
Nicolai Haehnle 0d4ad84aa2 TableGen: Remove VarInit::getFieldType
It is redundant with the implementation in TypedInit.

Change-Id: I8ab1fb5c77e4923f7eb3ffae5889f0f8af6093b4

Reviewers: arsenm, craig.topper, tra, MartinO

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D43678

llvm-svn: 326061
2018-02-25 20:50:17 +00:00
Nicolai Haehnle 85e4e95e6c TableGen: Get rid of Init::getFieldInit
Summary:
FieldInit will just rely on the standardized resolving mechanism to give
us DefInits for folding, thus simplifying the code.

Unlike the removal of resolveListElementReference, this shouldn't have
performance implications, because DefInits do not recurse inside their
record.

Change-Id: Id4544c774c9d9ee92f293615af6ecff706453f21

Reviewers: arsenm, craig.topper, tra, MartinO

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D43563

llvm-svn: 326060
2018-02-25 20:50:11 +00:00
Nicolai Haehnle 801403acb3 TableGen: Remove Init::resolveListElementReference
Summary:
Resolving a VarListElementInit should just resolve the list and then
take its element. This eliminates a lot of duplicated logic and
simplifies the next steps of refactoring resolveReferences.

This does potentially cause sub-elements of the entire list to be
resolved resulting in more work, but I didn't notice a measurable
change in performance, and a later patch adds a caching mechanism that
covers at least the common case of `var[i]` in a more generic way.

Change-Id: I7b59185b855c7368585c329c31e5be38c5749dac

Reviewers: arsenm, craig.topper, tra, MartinO

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D43562

llvm-svn: 326059
2018-02-25 20:50:04 +00:00
Mandeep Singh Grang 46d02dee65 [DebugInfo] Stable sort symbols to remove non-deterministic ordering
Summary: This fixes failure in DebugInfo/X86/multiple-aranges.ll uncovered by D39245.

Reviewers: rafael, echristo, probinson

Reviewed By: probinson

Subscribers: probinson, llvm-commits, JDevlieghere

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D39950

llvm-svn: 326056
2018-02-25 19:52:34 +00:00
Craig Topper aee341ef28 [InstSimplify] Add test cases for removal of vector fabs on known positive.
llvm-svn: 326050
2018-02-25 06:51:52 +00:00
Craig Topper 2b8f051aaa [InstSimplify] Remove unused parameter from test cases.
llvm-svn: 326049
2018-02-25 06:51:51 +00:00
Craig Topper 6694df14e6 [X86] Use SDNode instead of SDPatternOperator. NFC
llvm-svn: 326048
2018-02-25 06:21:04 +00:00
Simon Pilgrim 295e8b4e12 [TargetLowering] SimplifyDemandedVectorElts - pass demanded elts through ADD/SUB ops
llvm-svn: 326044
2018-02-24 20:59:14 +00:00
Simon Pilgrim c0dbdb86c3 [TargetLowering] SimplifyDemandedVectorElts - pass demanded elts through TRUNCATE ops
llvm-svn: 326043
2018-02-24 19:28:34 +00:00
Craig Topper a6f8100788 [X86] Add cvt tests to avx512vl-intrinsics-fast-isel.ll
llvm-svn: 326042
2018-02-24 18:58:08 +00:00
Craig Topper 81c0eaf4c8 [X86] Allow int_x86_sse2_cvtps2dq and int_x86_avx_cvt_ps2dq_256 to select EVEX encoded instructions.
llvm-svn: 326041
2018-02-24 18:58:07 +00:00
Craig Topper fe191ea950 [X86] Remove GCCBuiltin from some intrinsics that are no longer used by clang.
llvm-svn: 326040
2018-02-24 18:58:02 +00:00
Adam Nemet e4e1de60aa Revert "StructurizeCFG: Test for branch divergence correctly"
This reverts commit r325881.

Breaks many bots

llvm-svn: 326037
2018-02-24 17:29:09 +00:00
Scott Linder 4137bc68a4 [DebugInfo] Fix buildbot failure on non-X86 targets
llvm-svn: 326035
2018-02-24 16:25:43 +00:00
Simon Pilgrim a4fb569483 [X86][SSE] combineSubToSubus - support v8i64 handling from SSSE3
Our UMIN/UMAX, vector truncation and shuffle combining is good enough to efficiently handle v8i64 with the number of leading zeros that are necessary for PSUBUS.

llvm-svn: 326034
2018-02-24 14:06:39 +00:00
Simon Pilgrim 8ad91261e8 [X86][SSE] combineSubToSubus - support v8i32 handling from SSSE3 (not SSE41)
Now that UMIN etc are Legal/Custom for SSE2+, we can efficiently match SUBUS v8i32 cases from SSSE3 which can perform efficient truncation with PSHUFB.

llvm-svn: 326033
2018-02-24 13:39:13 +00:00
Simon Pilgrim 744f008a75 [X86][SSE] combineSubToSubus - begun generalizing to work with any type sizes with SplitBinaryOpsAndApply
llvm-svn: 326030
2018-02-24 12:44:12 +00:00
Simon Pilgrim 51ce2ed367 Fix spelling in comment. NFCI.
llvm-svn: 326029
2018-02-24 12:27:02 +00:00
Jonas Paulsson 8ff0773b13 [Sparc] Return true in enableMultipleCopyHints().
Enable multiple COPY hints to eliminate more COPYs during register allocation.

Note that this is something all targets should do, see
https://reviews.llvm.org/D38128.

Review: James Y Knight
llvm-svn: 326028
2018-02-24 08:24:31 +00:00
Craig Topper dc1797e346 [X86] Remove GCCBuiltin from some intrinsics that are no longer used by clang.
llvm-svn: 326026
2018-02-24 07:02:24 +00:00
Craig Topper 161c805da4 [X86] Use SelectionDAG::getNot instead of implementing manually. NFC
llvm-svn: 326020
2018-02-24 03:15:54 +00:00
Stanislav Mekhanoshin fa48c496e2 [AMDGPU] Shrinking V_SUBBREV_U32
V_SUBBREV_U32 is a commute opcode for V_SUBB_U32. However, when
we try to commute V_SUBB_U32 in order to shrink it we do not then
process V_SUBBREV_U32 and it stay VOP3. This is fixed.

Differential Revision: https://reviews.llvm.org/D43699

llvm-svn: 326011
2018-02-24 01:32:32 +00:00
Pavel Labath 725c035f54 Fix build breakage from r326003
- an ambiguous reference to Optional<T> in llvm-dwarfdump.cpp (fixed
  with an explicit prefix).
- a missing base class initialization in Entry copy constructor (fixed
  by using the implicitly default constructor, which is possible after
  some changes which were done during review).

llvm-svn: 326006
2018-02-24 00:54:31 +00:00
Alexander Shaposhnikov a8f15504c1 [llvm-objcopy] Fix typo in setSymTab
This diff fixes the name of the argument of 
setSymTab and makes setSymTab/setStrTab private 
(to make the public interface a bit cleaner).

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D43661

llvm-svn: 326005
2018-02-24 00:41:01 +00:00
Heejin Ahn 9386bde11b [WebAssembly] Add exception handling option and feature
Summary:
Add a llc command line option and WebAssembly architecture feature for
exception handling.

Reviewers: dschuff

Subscribers: jfb, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D43683

llvm-svn: 326004
2018-02-24 00:40:50 +00:00
Pavel Labath d99072bc97 Implement equal_range for the DWARF v5 accelerator table
Summary:
This patch implements the name lookup functionality of the .debug_names
accelerator table and hooks it up to "llvm-dwarfdump -find". To make the
interface of the two kinds of accelerator tables more consistent, I've
created an abstract "DWARFAcceleratorTable::Entry" class, which provides
a consistent interface to access the common functionality of the table
entries (such as getting the die offset, die tag, etc.). I've also
modified the apple table to vend entries conforming to this interface.

Reviewers: JDevlieghere, aprantl, probinson, dblaikie

Subscribers: vleschuk, clayborg, echristo, llvm-commits

Differential Revision: https://reviews.llvm.org/D43067

llvm-svn: 326003
2018-02-24 00:35:21 +00:00
George Burgess IV 6f49f4a951 [MemorySSA] Remove a redundant dyn_cast.
StartingAccess is a MemoryUseOrDef. No need to check again.

llvm-svn: 326000
2018-02-24 00:15:21 +00:00
Craig Topper 7bcac492d4 [X86] Remove checks for '(scalar_to_vector (i8 (trunc GR32:)))' from scalar masked move patterns.
This portion can be matched by other patterns. We don't need it to make the larger pattern valid. It's sufficient to have a v1i1 mask input without caring where it came from.

llvm-svn: 325999
2018-02-24 00:15:05 +00:00
Stanislav Mekhanoshin b9704c001c [AMDGPU] Fixed madak.ll test on VI, added GFX10. NFC.
llvm-svn: 325995
2018-02-23 23:53:27 +00:00
Yonghong Song b68cef9dd0 bpf: New disassembler testcases for 32-bit subregister support
This patch test disassembler output for load/store instructions when
-mattr=+alu32 specified for which we want to use "w" register format.

Also, this patch extended the existing insn-unit.s and insn-unit-32.s to
make sure disassemblers for all other instructions are not affected.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325993
2018-02-23 23:49:35 +00:00
Yonghong Song c4ca879fac bpf: New codegen testcases for 32-bit subregister support
This patch adds some unit tests for 32-bit subregister support.
We want to make sure ALU32, subregister load/store and new peephole
optimization are truely enabled once -mattr=+alu32 specified.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325992
2018-02-23 23:49:33 +00:00
Yonghong Song 60fed1fef0 bpf: New optimization pass for eliminating unnecessary i32 promotions
This pass performs peephole optimizations to cleanup ugly code sequences at
MachineInstruction layer.

Currently, the only optimization in this pass is to eliminate type
promotion
sequences for zero extending 32-bit subregisters to 64-bit registers.

If the compiler could prove the zero extended source come from 32-bit
subregistere then it is safe to erase those promotion sequece, because the
upper half of the underlying 64-bit registers were zeroed implicitly
already.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325991
2018-02-23 23:49:32 +00:00
Yonghong Song ae961bb061 bpf: New decoder namespace for 32-bit subregister load/store
When -mattr=+alu32 passed to the disassembler, use decoder namespace for
32-bit subregister.

This is to disassemble load and store instructions in preferred B format
as described in previous commit:

      w = *(u8 *) (r + off) // BPF_LDX | BPF_B
      w = *(u16 *)(r + off) // BPF_LDX | BPF_H
      w = *(u32 *)(r + off) // BPF_LDX | BPF_W

      *(u8 *) (r + off) = w // BPF_STX | BPF_B
      *(u16 *)(r + off) = w // BPF_STX | BPF_H
      *(u32 *)(r + off) = w // BPF_STX | BPF_W

NOTE: all other instructions should still use the default decoder
      namespace.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325990
2018-02-23 23:49:31 +00:00
Yonghong Song ca31c3bb3f bpf: Enable 32-bit subregister support for -mattr=+alu32
After all those preparation patches, now we could enable 32-bit subregister
support once -mattr=+alu32 specified.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325989
2018-02-23 23:49:30 +00:00
Yonghong Song fcd1e0f625 bpf: Support 32-bit subregister in various InstrInfo hooks
This patch support 32-bit subregister in three InstrInfo hooks, i.e.
copyPhysReg, loadRegFromStackSlot and storeRegToStackSlot,

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325988
2018-02-23 23:49:29 +00:00
Yonghong Song b1a52bd756 bpf: New instruction patterns for 32-bit subregister load and store
The instruction mapping between eBPF/arm64/x86_64 are:

         eBPF              arm64        x86_64
LD1   BPF_LDX | BPF_B       ldrb        movzbl
LD2   BPF_LDX | BPF_H       ldrh        movzwl
LD4   BPF_LDX | BPF_W       ldr         movl

movzbl/movzwl/movl on x86_64 accept 32-bit sub-register, for example %eax,
the same for ldrb/ldrh on arm64 which accept 32-bit "w" register. And
actually these instructions only accept sub-registers. There is no point
to have LD1/2/4 (unsigned) for 64-bit register, because on these arches,
upper 32-bits are guaranteed to be zeroed by hardware or VM, so load into
the smallest available register class is the best choice for maintaining
type information.

For eBPF we should adopt the same philosophy, to change current
format (A):

  r = *(u8 *) (r + off) // BPF_LDX | BPF_B
  r = *(u16 *)(r + off) // BPF_LDX | BPF_H
  r = *(u32 *)(r + off) // BPF_LDX | BPF_W

  *(u8 *) (r + off) = r // BPF_STX | BPF_B
  *(u16 *)(r + off) = r // BPF_STX | BPF_H
  *(u32 *)(r + off) = r // BPF_STX | BPF_W

into B:

  w = *(u8 *) (r + off) // BPF_LDX | BPF_B
  w = *(u16 *)(r + off) // BPF_LDX | BPF_H
  w = *(u32 *)(r + off) // BPF_LDX | BPF_W

  *(u8 *) (r + off) = w // BPF_STX | BPF_B
  *(u16 *)(r + off) = w // BPF_STX | BPF_H
  *(u32 *)(r + off) = w // BPF_STX | BPF_W

There is no change on encoding nor how should they be interpreted,
everything is as it is, load the specified length, write into low bits of
the register then zeroing all remaining high bits.

The only change is their associated register class and how compiler view
them.

Format A still need to be kept, because eBPF LLVM backend doesn't support
sub-registers at default, but once 32-bit subregister is enabled, it should
use format B.

This patch implemented this together with all those necessary extended load
and truncated store patterns.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325987
2018-02-23 23:49:28 +00:00
Yonghong Song 63cf273f55 bpf: Support i32 in getScalarShiftAmountTy method
getScalarShiftAmount method should be implemented for eBPF backend to make
sure shift amount could still get correct type once 32-bit subregisters
support are enabled.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325986
2018-02-23 23:49:26 +00:00
Yonghong Song 59fc805c7e bpf: Support condition comparison on i32
We need to support condition comparison on i32. All these comparisons are
supposed to be combined into BPF_J* instructions which only support i64.

For ISD::BR_CC we need to promote it to i64 first, then do custom lowering.

For ISD::SET_CC, just expand to SELECT_CC like what's been done for i64.

For ISD::SELECT_CC, we also want to do custom lower for i32. However, after
32-bit subregister support enabled, it is possible the comparison operands
are i32 while the selected value are i64, or the comparison operands are
i64 while the selected value are i32. We need to define extra instruction
pattern and support them in custom instruction inserter.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325985
2018-02-23 23:49:25 +00:00
Yonghong Song 219156cff0 bpf: Handle i32 for ALU operations without ISA support
There is no eBPF ISA support for BSWAP, ROTR, ROTL, SREM, SDIVREM, MULHU,
ADDC, ADDE etc on i32.

They could be emulated by other basic BPF_ALU operations, we'd set their
lowering action the same as i64.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325984
2018-02-23 23:49:24 +00:00
Yonghong Song 07a7a41753 bpf: New calling convention for 32-bit subregisters
This patch add new calling conventions to allow GPR32RegClass as valid
register class for arguments and return types.

New calling convention will only be choosen when -mattr=+alu32 specified.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325983
2018-02-23 23:49:23 +00:00
Yonghong Song 42389377d8 bpf: New target attribute "alu32" for 32-bit subregister support
This new attribute aims to control the enablement of 32-bit subregister
support on eBPF backend.

Name the interface as "alu32" is because we in particular want to enable
the generation of BPF_ALU32 instructions by enable subregister support.

This attribute could be used in the following format with llc:

  llc -mtriple=bpf -mattr=[+|-]alu32

It is disabled at default.

Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 325982
2018-02-23 23:49:22 +00:00