Commit Graph

25375 Commits

Author SHA1 Message Date
Luke Cheeseman 6db3a6a4a7 Revert r347490 as it breaks address sanitizer builds
llvm-svn: 347499
2018-11-23 17:13:06 +00:00
Luke Cheeseman d6dbd64104 Revert r343341
- Cannot reproduce the build failure locally and the build logs have
  been deleted.

llvm-svn: 347490
2018-11-23 11:01:47 +00:00
Craig Topper 0ec17884de [LegalizeVectorTypes] Don't use SplitVecOp_TruncateHelper if we're heading towards scalarizing the type.
This code takes a truncate, fp_to_int, or int_to_fp with a legal result type and an input type that needs to be split and enlarges the elements in the result type before doing the split. Then inserts a follow up truncate or fp_round after concatenating the two halves back together.

But if the input type of the original op is being split on its way to ultimately being scalarized we're just going to end up building a vector from scalars and then truncating or rounding it in the vector register. Seems kind of silly to enlarge the result element type of the operation only to end up with scalar code and then building a vector with large elements only to make the elements smaller again in the vector register. Seems better to just try to get away producing smaller result types in the scalarized code.

The X86 test case that changes is a pretty contrived test case that exists because of a bug we used to have in our AVG matching code. I think the code is better now, but its not realistic anyway.

llvm-svn: 347482
2018-11-23 02:32:13 +00:00
Craig Topper b239763384 [LegalizeVectorTypes] Have SplitVecOp_TruncateHelper fall back to SplitVecOp_UnaryOp if splitting the output type would be a legal type.
SplitVecOp_TruncateHelper tries to introduce a multilevel truncate to avoid scalarization. But if splitting the result type would still be a legal type we don't need to do that.

The comment block at the top of the function implied that this was already implemented. I looked back through the history and it doesn't look to have ever been checked.

llvm-svn: 347479
2018-11-22 22:56:52 +00:00
Sanjay Patel 3e80019275 [DAGCombiner] form 'not' ops ahead of shifts (PR39657)
We fail to canonicalize IR this way (prefer 'not' ops to arbitrary 'xor'),
but that would not matter without this patch because DAGCombiner was 
reversing that transform. I think we need this transform in the backend 
regardless of what happens in IR to catch cases where the shift-xor 
is formed late from GEP or other ops.

https://rise4fun.com/Alive/NC1

  Name: shl
  Pre: (-1 << C2) == C1
  %shl = shl i8 %x, C2
  %r = xor i8 %shl, C1
  =>
  %not = xor i8 %x, -1
  %r = shl i8 %not, C2
  
  Name: shr
  Pre: (-1 u>> C2) == C1
  %sh = lshr i8 %x, C2
  %r = xor i8 %sh, C1
  =>
  %not = xor i8 %x, -1
  %r = lshr i8 %not, C2

https://bugs.llvm.org/show_bug.cgi?id=39657

llvm-svn: 347478
2018-11-22 19:24:10 +00:00
Reid Kleckner 86ada54e4c [mingw] Use unmangled name after the $ in the section name
GCC does it this way, and we have to be consistent. This includes
stdcall and fastcall functions with suffixes. I confirmed that a
fastcall function named "foo" ends up in ".text$foo", not
".text$@foo@8".

Based on a patch by Andrew Yohn!

Fixes PR39218.

Differential Revision: https://reviews.llvm.org/D54762

llvm-svn: 347431
2018-11-21 22:01:10 +00:00
Sanjay Patel 20935e0ab5 [DAGCombiner] refactor select-of-FP-constants transform
This transform needs to be limited. 

We are converting to a constant pool load very early, and we 
are turning loads that are independent of the select condition 
(and therefore speculatable) into a dependent non-speculatable 
load.

We may also be transferring a condition code from an FP register
to integer to create that dependent load.

llvm-svn: 347424
2018-11-21 20:54:47 +00:00
Sanjay Patel 1c74747478 [DAGCombiner] reduce code duplication; NFC
llvm-svn: 347410
2018-11-21 20:00:32 +00:00
Simon Pilgrim 5448339889 [TargetLowering] SimplifyDemandedBits - only reduce known bits for integer constants
Avoids fuzzing crash found by Mikael Holmén.

llvm-svn: 347393
2018-11-21 14:26:19 +00:00
Nikita Popov c5b152bdef Test commit: Delete trailing space in comment
llvm-svn: 347385
2018-11-21 10:57:22 +00:00
Sanjay Patel 357053f289 [DAGCombiner] look through bitcasts when trying to narrow vector binops
This is another step in vector narrowing - a follow-up to D53784
(and hoping to eventually squash potential regressions seen in
D51553).

The x86 test diffs are wins, but the AArch64 diff is probably not.
That problem already exists independent of this patch (see PR39722), but it
went unnoticed in the previous patch because there were no regression tests
that showed the possibility.

The x86 diff in i64-mem-copy.ll is close. Given the frequency throttling
concerns with using wider vector ops, an extra extract to reduce vector
width is the right trade-off at this level of codegen.

Differential Revision: https://reviews.llvm.org/D54392

llvm-svn: 347356
2018-11-20 22:26:35 +00:00
Zachary Turner c68f895702 [CodeView] Add support for ref-qualified member functions.
When you have a member function with a ref-qualifier, for example:

struct Foo {
  void Func() &;
  void Func2() &&;
};

clang-cl was not emitting this information. Doing so is a bit
awkward, because it's not a property of the LF_MFUNCTION type, which
is what you'd expect. Instead, it's a property of the this pointer
which is actually an LF_POINTER. This record has an attributes
bitmask on it, and our handling of this bitmask was all wrong. We
had some parts of the bitmask defined incorrectly, but importantly
for this bug, we didn't know about these extra 2 bits that represent
the ref qualifier at all.

Differential Revision: https://reviews.llvm.org/D54667

llvm-svn: 347354
2018-11-20 22:13:43 +00:00
Zachary Turner 3826566c04 [CodeView] Mark this pointers as const.
This is for compatibility with MSVC, which also marks this pointers
as being const-qualified.

Fixes llvm.org/pr36526

Differential Revision: https://reviews.llvm.org/D54736

llvm-svn: 347353
2018-11-20 22:13:23 +00:00
Simon Pilgrim 3735105961 [DAGCombine] Add calls to SimplifyDemandedVectorElts from visitINSERT_SUBVECTOR (PR37989)
This uncovered an off-by-one typo in SimplifyDemandedVectorElts's INSERT_SUBVECTOR handling as its bounds check was bailing on safe indices.

llvm-svn: 347313
2018-11-20 15:23:50 +00:00
Simon Pilgrim b356d0463e [TargetLowering] Improve SimplifyDemandedVectorElts/SimplifyDemandedBits support
For bitcast nodes from larger element types, add the ability for SimplifyDemandedVectorElts to call SimplifyDemandedBits by merging the elts mask to a bits mask.

I've raised https://bugs.llvm.org/show_bug.cgi?id=39689 to deal with the few places where SimplifyDemandedBits's lack of vector handling is a problem.

Differential Revision: https://reviews.llvm.org/D54679

llvm-svn: 347301
2018-11-20 12:02:16 +00:00
Craig Topper 4954c66430 [SelectionDAG] Compute known bits and num sign bits for live out vector registers. Use it to add AssertZExt/AssertSExt in the live in basic blocks
Summary:
We already support this for scalars, but it was explicitly disabled for vectors. In the updated test cases this allows us to see the upper bits are zero to use less multiply instructions to emulate a 64 bit multiply.

This should help with this ispc issue that a coworker pointed me to https://github.com/ispc/ispc/issues/1362

Reviewers: spatel, efriedma, RKSimon, arsenm

Reviewed By: spatel

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D54725

llvm-svn: 347287
2018-11-20 04:30:26 +00:00
Sanjay Patel a36c444471 [DAGCombiner] reduce code duplication in visitXOR; NFC
llvm-svn: 347278
2018-11-20 00:51:45 +00:00
Stanislav Mekhanoshin 54ebfe8aee Implement computeKnownBits for scalar_to_vector
Differential Revision: https://reviews.llvm.org/D54728

llvm-svn: 347274
2018-11-19 23:34:07 +00:00
Simon Pilgrim 740122fb8c [DAGCombine] SimplifyNodeWithTwoResults - ensure same legalization for LO/HI operands (PR21207)
Consistently use (!LegalOperations || isOperationLegalOrCustom) for all node pairs.

Differential Revision: https://reviews.llvm.org/D53478

llvm-svn: 347255
2018-11-19 19:37:59 +00:00
Simon Pilgrim 01f4c4be90 Fix Wdocumentation warning. NFCI.
llvm-svn: 347253
2018-11-19 19:18:33 +00:00
Simon Pilgrim 493ac5320b Fix unused function warning.
llvm-svn: 347252
2018-11-19 19:18:00 +00:00
Simon Pilgrim de3605f56b [TargetLowering] expandFP_TO_UINT - improve fp16 support
As discussed on D53794, for float types with ranges smaller than the destination integer type, then we should be able to just use a regular FP_TO_SINT opcode.

I thought we'd need to provide MSA test cases for very small integer types as well (fp16 -> i8 etc.), but it turns out that promotion will kick in so they're unnecessary.

Differential Revision: https://reviews.llvm.org/D54703

llvm-svn: 347251
2018-11-19 19:16:13 +00:00
Simon Pilgrim 9ad5717fcc Add missing stream operator for Polynomial class to fix debug builds.
llvm-svn: 347249
2018-11-19 18:57:49 +00:00
Martin Elshuber 5a47dc607e [InterleavedLoadCombine] Fix warnings
* remove unused function
* fix compare

llvm-svn: 347241
2018-11-19 18:35:31 +00:00
Paul Robinson cda5421016 [DebugInfo] DISubprogram flags get their own flags word. NFC.
This will hold flags specific to subprograms. In the future
we could potentially free up scarce bits in DIFlags by moving
subprogram-specific flags from there to the new flags word.

This patch does not change IR/bitcode formats, that will be
done in a follow-up.

Differential Revision: https://reviews.llvm.org/D54597

llvm-svn: 347239
2018-11-19 18:29:28 +00:00
Martin Elshuber 026a2a5152 [InterleavedLoadCombine] Fix warning unused variable
Differential Revision: https://reviews.llvm.org/D52653

llvm-svn: 347229
2018-11-19 17:11:48 +00:00
Sanjay Patel b25adf5edb [SelectionDAG] simplify vector select with undef operand(s)
llvm-svn: 347227
2018-11-19 17:06:05 +00:00
Benjamin Kramer 22a04efcb0 [InterleavedLoadCombine] Remove unused include. NFC.
llvm-svn: 347226
2018-11-19 17:01:19 +00:00
Sanjay Patel a1dca3553e [SelectionDAG] simplify select FP with undef condition
llvm-svn: 347212
2018-11-19 14:42:28 +00:00
Sanjay Patel c036d844be [SelectionDAG] add simplifySelect() to reduce code duplication; NFC
This should be extended to handle FP and vectors in follow-up patches.

llvm-svn: 347210
2018-11-19 14:35:22 +00:00
Martin Elshuber fef3036d37 Subject: [PATCH] [CodeGen] Add pass to combine interleaved loads.
This patch defines an interleaved-load-combine pass. The pass searches
for ShuffleVector instructions that represent interleaved loads. Matches are
converted such that they will be captured by the InterleavedAccessPass.

The pass extends LLVMs capabilities to use target specific instruction
selection of interleaved load patterns (e.g.: ld4 on Aarch64
architectures).

Differential Revision: https://reviews.llvm.org/D52653

llvm-svn: 347208
2018-11-19 14:26:10 +00:00
Serge Guelton 12c7a96064 Fix disturbing warning - NFCI
llvm-svn: 347186
2018-11-19 10:05:28 +00:00
Vedant Kumar e7b789b529 [ProfileSummary] Standardize methods and fix comment
Every Analysis pass has a get method that returns a reference of the Result of
the Analysis, for example, BlockFrequencyInfo
&BlockFrequencyInfoWrapperPass::getBFI().  I believe that
ProfileSummaryInfo::getPSI() is the only exception to that, as it was returning
a pointer.

Another change is renaming isHotBB and isColdBB to isHotBlock and isColdBlock,
respectively.  Most methods use BB as the argument of variable names while
methods usually refer to Basic Blocks as Blocks, instead of BB.  For example,
Function::getEntryBlock, Loop:getExitBlock, etc.

I also fixed one of the comments.

Patch by Rodrigo Caetano Rocha!

Differential Revision: https://reviews.llvm.org/D54669

llvm-svn: 347182
2018-11-19 05:23:16 +00:00
Sanjay Patel 8c0cd77bff [DAG] add undef simplifications for select nodes
Sadly, this duplicates (twice) the logic from InstSimplify. There
might be some way to at least share the DAG versions of the code, 
but copying the folds seems to be the standard method to ensure 
that we don't miss these folds. 

Unlike in IR, we don't run DAGCombiner to fixpoint, so there's no 
way to ensure that we do these kinds of simplifications unless the 
code is repeated at node creation time and during combines.

There were other tests that would become worthless with this
improvement that I changed as pre-commits:
rL347161
rL347164
rL347165
rL347166
rL347167

I'm not sure how to salvage the remaining tests (diffs in this patch).
So the x86 tests verify that the new code is working as intended.
The AMDGPU test is actually similar to my motivating case: we have
some undef value that has survived to machine IR in an x86 test, and 
then it gets folded in some weird way, or we crash if we don't transfer
the undef flag. But we would have been better off never getting to that
point by doing these simplifications.

This will lead back to PR32023 someday...
https://bugs.llvm.org/show_bug.cgi?id=32023

llvm-svn: 347170
2018-11-18 17:36:23 +00:00
Sanjay Patel 42c22a1f87 [SelectionDAG] simplify code; NFC
llvm-svn: 347160
2018-11-18 14:39:03 +00:00
Fangrui Song 7570932977 Use llvm::copy. NFC
llvm-svn: 347126
2018-11-17 01:44:25 +00:00
Stanislav Mekhanoshin 0ff7c8309d DAG combiner: fold (select, C, X, undef) -> X
Differential Revision: https://reviews.llvm.org/D54646

llvm-svn: 347110
2018-11-16 23:13:38 +00:00
Craig Topper 9e97054211 [LegalizeVectorOps] After custom legalizing an extending load or a truncating store, make sure the custom code is also legal.
For example, on X86 we emit a sign_extend_vector_inreg from LowerLoad and without sse4.1 this node will need further legalization. Previously this sign_extend_vector_inreg was being custom lowered during DAG legalization instead of vector op legalization.

Unfortunately, this doesn't seem to matter for the output of any existing lit tests.

llvm-svn: 347094
2018-11-16 21:04:58 +00:00
Reid Kleckner 755577168a [codeview] Expose -gcodeview-ghash for global type hashing
Summary:
Experience has shown that the functionality is useful. It makes linking
optimized clang with debug info for me a lot faster, 20s to 13s. The
type merging phase of PDB writing goes from 10s to 3s.

This removes the LLVM cl::opt and replaces it with a metadata flag.

After this change, users can do the following to use ghash:
- add -gcodeview-ghash to compiler flags
- replace /DEBUG with /DEBUG:GHASH in linker flags

Reviewers: zturner, hans, thakis, takuto.ikuta

Subscribers: aprantl, hiraditya, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D54370

llvm-svn: 347072
2018-11-16 18:47:41 +00:00
Simon Pilgrim 3c8baf4f90 [TargetLowering] Cleanup more of the EXTEND demanded bits cases so that they match. NFCI.
Use the same variable names etc.

llvm-svn: 347045
2018-11-16 12:26:26 +00:00
Sam Parker ab99cfab21 [DAGCombine] Fix non-deterministic debug output
PR37970 reported non-deterministic debug output, this was caused by
iterating through a set and not a a vector.

bugzilla: https://bugs.llvm.org/show_bug.cgi?id=37970

Differential Revision: https://reviews.llvm.org/D54570

llvm-svn: 347037
2018-11-16 08:35:19 +00:00
Craig Topper bac7d9735a [LegalizeVectorTypes] Teach WidenVecRes_Convert to turn ANY_EXTEND into ANY_EXTEND_VECTOR_INREG when the input and output types need to be widened to the same width.
If we don't do it here, DAGCombine will just end up creating it from the scalar any_extend+build_vector so might as well save a step.

llvm-svn: 347034
2018-11-16 07:13:34 +00:00
Heejin Ahn 095796a391 [WebAssembly] Split BBs after throw instructions
Summary:
`throw` instruction is a terminator in wasm, but BBs were not splitted
after `throw` instructions, causing machine instruction verifier to
fail.

This patch
- Splits BBs after `throw` instructions in WasmEHPrepare and adding an
  unreachable instruction after `throw`, which will be deleted in
  LateEHPrepare pass
- Refactors WasmEHPrepare into two member functions
- Changes the semantics of `eraseBBsAndChildren` in LateEHPrepare pass
  to match that of WasmEHPrepare pass, which is newly added. Now
  `eraseBBsAndChildren` does not delete BBs with remaining predecessors.
- Fixes style nits, making static function names conform to clang-tidy
- Re-enables the test temporarily disabled by rL346840 && rL346845

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54571

llvm-svn: 347003
2018-11-16 00:47:18 +00:00
Jessica Paquette ddb039a199 [MachineOutliner][NFC] Check if CandidatesForRepeatedSeq < 2
There's no reason to call getOutliningCandidateInfo with a single candidate.

llvm-svn: 346913
2018-11-15 00:02:24 +00:00
Nirav Dave 1241dcb3cf Bias physical register immediate assignments
The machine scheduler currently biases register copies to/from
physical registers to be closer to their point of use / def to
minimize their live ranges. This change extends this to also physical
register assignments from immediate values.

This causes a reduction in reduction in overall register pressure and
minor reduction in spills and indirectly fixes an out-of-registers
assertion (PR39391).

Most test changes are from minor instruction reorderings and register
name selection changes and direct consequences of that.

Reviewers: MatzeB, qcolombet, myatsina, pcc

Subscribers: nemanjai, jvesely, nhaehnle, eraman, hiraditya,
  javed.absar, arphaman, jfb, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D54218

llvm-svn: 346894
2018-11-14 21:11:53 +00:00
Heejin Ahn da419bdb5e [WebAssembly] Add support for the event section
Summary:
This adds support for the 'event section' specified in the exception
handling proposal. (This was named 'exception section' first, but later
renamed to 'event section' to take possibilities of other kinds of
events into consideration. But currently we only store exception info in
this section.)

The event section is added between the global section and the export
section. This is for ease of validation per request of the V8 team.

This patch:
- Creates the event symbol type, which is a weak symbol
- Makes 'throw' instruction take the event symbol '__cpp_exception'
- Adds relocation support for events
- Adds WasmObjectWriter / WasmObjectFile (Reader) support
- Adds obj2yaml / yaml2obj support
- Adds '.eventtype' printing support

Reviewers: dschuff, sbc100, aardappel

Subscribers: jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D54096

llvm-svn: 346825
2018-11-14 02:46:21 +00:00
Eli Friedman 6bdabcf368 [CodeGen] Fix forward scan in MachineBasicBlock::computeRegisterLiveness.
The scan was incorrectly skipping the first instruction, so a register
could appear to be dead when it was actually live. This eventually leads
to a machine verifier failure and miscompile in arm-ldst-opt.

Differential Revision: https://reviews.llvm.org/D54491

llvm-svn: 346821
2018-11-14 00:39:29 +00:00
Jessica Paquette cad864d49e [MachineOutliner][NFC] Use MBB flags to avoid call checks in getOutliningInfo
We already determine a bunch of information about an MBB in
getMachineOutlinerMBBFlags. We can reuse that information to avoid calculating
things that must be false/true.

The first thing we can easily check is if an outlined sequence could ever
contain calls. There's no reason to walk over the outlined range, checking for
calls, if we already know that there are no calls in the block containing the
sequence.

llvm-svn: 346809
2018-11-13 23:01:34 +00:00
Jessica Paquette b2d53c5d7d [MachineOutliner][NFC] Exit getOutliningType if there are < 2 candidates
Since we never outline anything with fewer than 2 occurrences, there's no
reason to compute cost model information if there's less than that.

llvm-svn: 346803
2018-11-13 22:16:27 +00:00
Stanislav Mekhanoshin 35de877e8c Fixed DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT i1 handling
Legalizer used to request an ext load from i8 to i1 when promoting
vector element type to i8. Fixed.

Differential Revision: https://reviews.llvm.org/D54440

llvm-svn: 346795
2018-11-13 20:26:27 +00:00
Fangrui Song d8fd0ec032 [AsmPrinter] Rename a comment of .debug_gnu_pubnames entry
Summary:
The comment refers to the field as "Kind:". However, in gdb,

https://sourceware.org/gdb//onlinedocs/gdb/Index-Section-Format.html names it "attributes",
gdb/dwarf2read.c:dw2_symtab_iter_next refers to the whole value as "cu_index_and_attrs"

Change it to `Attributes:` for consistency.

Reviewers: dblaikie

Reviewed By: dblaikie

Subscribers: aprantl, JDevlieghere, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D54480

llvm-svn: 346790
2018-11-13 20:18:08 +00:00
David Blaikie bb279116f2 DebugInfo: Add a CU metadata attribute for use of DWARF ranges base address specifiers
Summary:
Ranges base address specifiers can save a lot of object size in
relocation records especially in optimized builds.

For an optimized self-host build of Clang with split DWARF and debug
info compression in object files, but uncompressed debug info in the
executable, this change produces about 18% smaller object files and 6%
larger executable.

While it would've been nice to turn this on by default, gold's 32 bit
gdb-index support crashes on this input & I don't think there's any
perfect heuristic to implement solely in LLVM that would suffice - so
we'll need a flag one way or another (also possible people might want to
aggressively optimized for executable size that contains debug info
(even with compression this would still come at some cost to executable
size)) - so let's plumb it through.

Differential Revision: https://reviews.llvm.org/D54242

llvm-svn: 346788
2018-11-13 20:08:10 +00:00
Craig Topper aca8390216 [SelectionDAG][X86] Relax restriction on the width of an input to *_EXTEND_VECTOR_INREG. Use them and regular *_EXTEND to replace the X86 specific VSEXT/VZEXT opcodes
Previously, the extend_vector_inreg opcode required their input register to be the same total width as their output. But this doesn't match up with how the X86 instructions are defined. For X86 the input just needs to be a legal type with at least enough elements to cover the output.

This patch weakens the check on these nodes and allows them to be used as long as they have more input elements than output elements. I haven't changed type legalization behavior so it will still create them with matching input and output sizes.

X86 will custom legalize these nodes by shrinking the input to be a 128 bit vector and once we've done that we treat them as legal operations. We still have one case during type legalization where we must custom handle v64i8 on avx512f targets without avx512bw where v64i8 isn't a legal type. In this case we will custom type legalize to a *extend_vector_inreg with a v16i8 input. After that the input is a legal type so type legalization should ignore the node and doesn't need to know about the relaxed restriction. We are no longer allowed to use the default expansion for these nodes during vector op legalization since the default expansion uses a shuffle which required the widths to match. Custom legalization for all types will prevent us from reaching the default expansion code.

I believe DAG combine works correctly with the released restriction because it doesn't check the number of input elements.

The rest of the patch is changing X86 to use either the vector_inreg nodes or the regular zero_extend/sign_extend nodes. I had to add additional isel patterns to handle any_extend during isel since simplifydemandedbits can create them at any time so we can't legalize to zero_extend before isel. We don't yet create any_extend_vector_inreg in simplifydemandedbits.

Differential Revision: https://reviews.llvm.org/D54346

llvm-svn: 346784
2018-11-13 19:45:21 +00:00
Cameron McInally cbde0d9c7b [IR] Add a dedicated FNeg IR Instruction
The IEEE-754 Standard makes it clear that fneg(x) and
fsub(-0.0, x) are two different operations. The former is a bitwise
operation, while the latter is an arithmetic operation. This patch
creates a dedicated FNeg IR Instruction to model that behavior.

Differential Revision: https://reviews.llvm.org/D53877

llvm-svn: 346774
2018-11-13 18:15:47 +00:00
Alexander Kornienko 3635c89070 Fix uninitialized variable.
Flags variable was not initialized and later used (both isMBBSafeToOutlineFrom
implementations assume it's initialized), which breaks
test/CodeGen/AArch64/machine-outliner.mir. under memory sanitizer:
MemorySanitizer: use-of-uninitialized-value
    #0  in llvm::AArch64InstrInfo::getOutliningType(llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>&, unsigned int) const llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:5494:9
    #1  in (anonymous namespace)::InstructionMapper::convertToUnsignedVec(llvm::MachineBasicBlock&, llvm::TargetInstrInfo const&) llvm/lib/CodeGen/MachineOutliner.cpp:772:19
    #2  in (anonymous namespace)::MachineOutliner::populateMapper((anonymous namespace)::InstructionMapper&, llvm::Module&, llvm::MachineModuleInfo&) llvm/lib/CodeGen/MachineOutliner.cpp:1543:14
    #3  in (anonymous namespace)::MachineOutliner::runOnModule(llvm::Module&) llvm/lib/CodeGen/MachineOutliner.cpp:1645:3
    #4  in (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) llvm/lib/IR/LegacyPassManager.cpp:1744:27
    #5  in llvm::legacy::PassManagerImpl::run(llvm::Module&) llvm/lib/IR/LegacyPassManager.cpp:1857:44
    #6  in compileModule(char**, llvm::LLVMContext&) llvm/tools/llc/llc.cpp:597:8

llvm-svn: 346761
2018-11-13 16:41:05 +00:00
Craig Topper 0b33b468a1 [DAGCombiner] Enable tryToFoldExtendOfConstant to run after legalize vector ops
It should be ok to create a new build_vector after legal operations so long as it doesn't cause an infinite loop in DAG combiner.

Unfortunately, X86's custom constant folding in combineVSZext is hiding any test changes from this. But I'm trying to get to a point where that X86 specific code isn't necessary at all.

Differential Revision: https://reviews.llvm.org/D54285

llvm-svn: 346728
2018-11-13 01:59:32 +00:00
Jessica Paquette 82d9c0a3fa [MachineOutliner][NFC] Change getMachineOutlinerMBBFlags to isMBBSafeToOutlineFrom
Instead of returning Flags, return true if the MBB is safe to outline from.

This lets us check for unsafe situations, like say, in AArch64, X17 is live
across a MBB without being defined in that MBB. In that case, there's no point
in performing an instruction mapping.

llvm-svn: 346718
2018-11-12 23:51:32 +00:00
Philip Reames e44a55dc98 [GC][NFC] Simplify code now that we only have one safepoint kind
This is the NFC follow up to exploit the semantic simplification from r346701

llvm-svn: 346712
2018-11-12 22:03:53 +00:00
Ali Tamur d482b01a62 Use a data structure better suited for large sets in SimplificationTracker.
Summary:
D44571 changed SimplificationTracker to use SmallSetVector to keep phi nodes. As a result, when the number of phi nodes is large, the build time performance suffers badly. When building for power pc, we have a case where there are more than 600.000 nodes, and it takes too long to compile.

In this change, I partially revert D44571 to use SmallPtrSet, which does an acceptable job with any number of elements. In the original patch, having a deterministic iteration order was mentioned as a motivation, however I think it only applies to the nodes already matched in MatchPhiSet method, which I did not touch.

Reviewers: bjope, skatkov

Reviewed By: bjope, skatkov

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D54007

llvm-svn: 346710
2018-11-12 21:43:43 +00:00
Philip Reames c75a0c3f69 [GC] Remove so called PreCall safepoints
Remove another bit of unused configuration potential from GCStrategy.  It's not entirely clear what the intention here was, but from the docs, it sounds like this may have been subsumed by patchable call support.

Note: This change is deliberately small to make it clear that while implemented, there's nothing using the option.  A following NFC will do most of the simplifications.
llvm-svn: 346701
2018-11-12 20:15:34 +00:00
Stanislav Mekhanoshin 5f9513147a Fix MachineInstr::findRegisterUseOperandIdx subreg checks
The function only checks that instruction reads a super-register
containing requested physical register. In case if a sub-register
if being read that is also a use of a super-reg, so added the check.
In particular MI->readsRegister() is broken because of the missing
check. The resulting check is essentially regsOverlap().

Differential Revision: https://reviews.llvm.org/D54128

llvm-svn: 346686
2018-11-12 18:12:28 +00:00
Jessica Paquette 9702144341 [MachineOutliner][NFC] Early exit pruning when candidates don't share an MBB
There's no way they can overlap in this case.

This can save a few iterations when the candidate is close to the beginning
of a MachineBasicBlock. It's particularly useful when the average length of
a MachineBasicBlock in the program is small.

llvm-svn: 346682
2018-11-12 17:50:56 +00:00
Jessica Paquette 3954272ac1 [MachineOutliner][NFC] Put suffix tree in buildCandidateList
It's only used there, so it doesn't make much sense to have it in runOnModule.

llvm-svn: 346681
2018-11-12 17:50:55 +00:00
Paul Robinson 5b302bfc8e [DWARFv5] Emit split type units in .debug_info.dwo.
Differential Revision: https://reviews.llvm.org/D54350

llvm-svn: 346674
2018-11-12 16:55:11 +00:00
Nirav Dave a395e2df56 [DAGCombiner] Fix load-store forwarding of indexed loads.
Summary:
Handle extra output from index loads in cases where we wish to
forward a load value directly from a preceeding store.

Fixes PR39571.

Reviewers: peter.smith, rengolin

Subscribers: javed.absar, hiraditya, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D54265

llvm-svn: 346654
2018-11-12 14:05:40 +00:00
Philip Reames 8b48ceac80 [GC] Remove unused configuration variable
The custom root mechanism didn't actually do anything.  ShadowStackGC, the only one which used it, just removed the gcroots before they reached the normal lowering in SelectionDAG.  As a result, the state flag had no value.

llvm-svn: 346632
2018-11-12 02:34:54 +00:00
Philip Reames 1559021751 [GC] Minor style modernization
llvm-svn: 346631
2018-11-12 02:26:26 +00:00
Philip Reames 18945d6c99 [GCRoot] Remove some unneccessary complexity
The GCStrategy provides three configuration options were are largely redundant.

1) Support for conditionally lowering gcread and gcwrite to loads and stores.  This is redundant since any GC which wished to use these abstractions would lower them out of existance before the built in lowering anyways.  As such, there's no need to have the lowering being conditional.
2) Conditional initialization for allocas marked via gcroot.  Semantically, roots have to be initialized before first potential use.  Arguably, the frontend really should have responsibility for that, but the old API allowed the frontend to ignore this detail.  Only one builtin GC used the non-initializing mode.  Since no one to my knowledge actually uses the ErlangGC strategy, I decide the slight pessimization was worth the simplicity.  If that turns out to be problematic, we can always improve the insertion algorithm to detect more existing initializing stores.

llvm-svn: 346621
2018-11-11 21:13:09 +00:00
Craig Topper d23cdbbeb2 [DAGCombiner] Make tryToFoldExtendOfConstant return an SDValue instead of an SDNode*. NFC
Removes the need to call getNode internally and to recreate an SDValue after the call.

llvm-svn: 346600
2018-11-10 23:46:03 +00:00
Sanjay Patel 0a515595a7 [x86] allow vector load narrowing with multi-use values
This is a long-awaited follow-up suggested in D33578. Since then, we've picked up even more
opportunities for vector narrowing from changes like D53784, so there are a lot of test diffs.
Apart from 2-3 strange cases, these are all wins.

I've structured this to be no-functional-change-intended for any target except for x86
because I couldn't tell if AArch64, ARM, and AMDGPU would improve or not. All of those
targets have existing regression tests (4, 4, 10 files respectively) that would be
affected. Also, Hexagon overrides the shouldReduceLoadWidth() hook, but doesn't show
any regression test diffs. The trade-off is deciding if an extra vector load is better
than a single wide load + extract_subvector.

For x86, this is almost always better (on paper at least) because we often can fold
loads into subsequent ops and not increase the official instruction count. There's also
some unknown -- but potentially large -- benefit from using narrower vector ops if wide
ops are implemented with multiple uops and/or frequency throttling is avoided.

Differential Revision: https://reviews.llvm.org/D54073

llvm-svn: 346595
2018-11-10 20:05:31 +00:00
Philip Reames 9b8c102675 [GC] Rename a header for consistency
llvm-svn: 346588
2018-11-10 16:08:10 +00:00
Matthias Braun fb93aecf8d RegAllocFast: Further cleanups; NFC
llvm-svn: 346576
2018-11-10 00:36:27 +00:00
Philip Reames afa1742b4b [GC] Simplify linking of GC builtin GC strategies
llvm-svn: 346569
2018-11-09 23:56:21 +00:00
Craig Topper f2e65f8636 [SelectionDAG] Fix a -Wparentheses warning from gcc in an assert. NFC
gcc wants parentheses around the logical OR since there is a logical AND for the string.

llvm-svn: 346564
2018-11-09 23:11:30 +00:00
Paul Robinson ddbde9a4ad [DWARFv5] Emit normal type units in .debug_info comdats.
Differential Revision: https://reviews.llvm.org/D54282

llvm-svn: 346540
2018-11-09 19:06:09 +00:00
Craig Topper 9a7e19b8f2 [DAGCombiner][X86][Mips] Enable combineShuffleOfScalars to run between vector op legalization and DAG legalization. Fix bad one use check in combineShuffleOfScalars
It's possible for vector op legalization to generate a shuffle. If that happens we should give a chance for DAG combine to combine that with a build_vector input.

I also fixed a bug in combineShuffleOfScalars that was considering the number of uses on a undef input to a shuffle. We don't care how many times undef is used.

Differential Revision: https://reviews.llvm.org/D54283

llvm-svn: 346530
2018-11-09 18:04:34 +00:00
Serge Guelton 86f8b70f1b Type safe version of MachinePassRegistry
Previous version used type erasure through a `void* (*)()` pointer,
which triggered gcc warning and implied a lot of reinterpret_cast.

This version should make it harder to hit ourselves in the foot.

Differential revision: https://reviews.llvm.org/D54203

llvm-svn: 346522
2018-11-09 17:19:45 +00:00
Zaara Syeda 5c179bf14b [Power9] Allow gpr callee saved spills in prologue to vectors registers
Currently in llvm, CalleeSavedInfo can only assign a callee saved register to
stack frame index to be spilled in the prologue. We would like to enable
spilling gprs to vector registers. This patch adds the capability to spill to
other registers aside from just the stack. It also adds the changes for power9
to spill gprs to volatile vector registers when they are available.
This happens only for leaf functions when using the option
-ppc-enable-pe-vector-spills.

Differential Revision: https://reviews.llvm.org/D39386

llvm-svn: 346512
2018-11-09 16:36:24 +00:00
Alexandros Lamprineas e15c982f6d [SelectionDAG] swap select_cc operands to enable folding
The DAGCombiner tries to SimplifySelectCC as follows:

  select_cc(x, y, 16, 0, cc) -> shl(zext(set_cc(x, y, cc)), 4)

It can't cope with the situation of reordered operands:

  select_cc(x, y, 0, 16, cc)

In that case we just need to swap the operands and invert the Condition Code:

  select_cc(x, y, 16, 0, ~cc)

Differential Revision: https://reviews.llvm.org/D53236

llvm-svn: 346484
2018-11-09 11:09:40 +00:00
Craig Topper 8cca8bd4aa [SelectionDAG] Assert on the width of DemandedElts argument to computeKnownBits for all vector typed operations not just build_vector.
Fix AArch64 unit test that fails with the assertion added.

llvm-svn: 346437
2018-11-08 20:29:17 +00:00
Nirav Dave 6ce9f72f76 [DAGCombine] Improve alias analysis for chain of independent stores.
FindBetterNeighborChains simulateanously improves the chain
dependencies of a chain of related stores avoiding the generation of
extra token factors. For chains longer than the GatherAllAliasDepths,
stores further down in the chain will necessarily fail, a potentially
significant waste and preventing otherwise trivial parallelization.

This patch directly parallelize the chains of stores before improving
each store. This generally improves DAG-level parallelism.

Reviewers: courbet, spatel, RKSimon, bogner, efriedma, craig.topper, rnk

Subscribers: sdardis, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D53552

llvm-svn: 346432
2018-11-08 19:14:20 +00:00
David Blaikie c8f7e6c1a9 NFC: DebugInfo: Track the origin CU rather than just the base address for range lists
Turns out knowing more than just the base address might be useful -
specifically a future change to respect a DICompileUnit flag for the use
of base address specifiers in DWARF < 5.

llvm-svn: 346380
2018-11-08 00:35:54 +00:00
Jessica Paquette c4cf775ae0 [MachineOutliner][NFC] Only map blocks which have adjacent legal instructions
If a block doesn't have any ranges of adjacent legal instructions, then it
can't have outlining candidates. There's no point in mapping legal isntructions
in situations like this.

I noticed this reduces the size of the suffix tree in sqlite3 for AArch64 at
-Oz by about 3%.

llvm-svn: 346379
2018-11-08 00:33:38 +00:00
Jessica Paquette 267d266c29 [MachineOutliner][NFC] Don't map MBBs that don't contain legal instructions
I noticed that there are lots of basic blocks that don't have enough legal
instructions in them to warrant outlining. We can skip mapping these entirely.

In sqlite3, compiled for AArch64 at -Oz, this results in a 10% reduction of
the total nodes in the suffix tree. These nodes can never be part of a
repeated substring, and so they don't impact the result at all.

Before this, there were 62128 nodes in the tree for sqlite3. After this, there
are 56457 nodes.

llvm-svn: 346373
2018-11-08 00:02:11 +00:00
Jessica Paquette df5b09b8ce [MachineOutliner][NFC] Remove Parent field from SuffixTreeNode
This is only used for calculating ConcatLen. This isn't necessary,
since it's easily derived from the traversal setting suffix indices.

Remove that. Rename CurrIdx to CurrNodeLen to better describe what's
going on.

llvm-svn: 346349
2018-11-07 19:56:13 +00:00
Jessica Paquette a409cc959b [MachineOutliner][NFC] Traverse suffix tree using a RepeatedSubstring iterator
This takes the traversal methods introduced in r346269 and adapts them
into an iterator. This allows the outliner to iterate over repeated substrings
within the suffix tree directly without having to initially find all of the
substrings and then iterate over them after you've found them.

llvm-svn: 346345
2018-11-07 19:20:55 +00:00
Jessica Paquette a3eb0fac3b [MachineOutliner] Don't store outlined function numberings on OutlinedFunction
NFC-ish. This doesn't change the behaviour of the outliner, but does make sure
that you won't end up with say

OUTLINED_FUNCTION_2:
...
ret

OUTLINED_FUNCTION_248:
...
ret

as the only outlined functions in your module. Those should really be

OUTLINED_FUNCTION_0:
...
ret

OUTLINED_FUNCTION_1:
...
ret

If we produce outlined functions, they probably should have sequential numbers
attached to them. This makes it a bit easier+stable to write outliner tests.

The point of this is to move towards a bit more stability in outlined function
names. By doing this, we at least don't rely on the traversal order of the
suffix tree. Instead, we rely on the order of the candidate list, which is
*far* more consistent. The candidate list is ordered by the end indices of
candidates, so we're more likely to get a stable ordering. This is still
susceptible to changes in the cost model though (like, if we suddenly find new
candidates, for example).

llvm-svn: 346340
2018-11-07 18:36:43 +00:00
Serge Guelton a4d9e2293a Fix ignorded type qualifier warning [NFC]
llvm-svn: 346332
2018-11-07 16:17:30 +00:00
James Y Knight 72f76bf230 Add support for llvm.is.constant intrinsic (PR4898)
This adds the llvm-side support for post-inlining evaluation of the
__builtin_constant_p GCC intrinsic.

Also fixed SCCPSolver::visitCallSite to not blow up when seeing a call
to a function where canConstantFoldTo returns true, and one of the
arguments is a struct.

Updated from patch initially by Janusz Sobczak.

Differential Revision: https://reviews.llvm.org/D4276

llvm-svn: 346322
2018-11-07 15:24:12 +00:00
Matthias Braun 5b7c90b4e2 RegAllocFast: Leave unassigned virtreg entries in map
Set `LiveReg::PhysReg` to zero when freeing a register instead of
removing it from the entry from `LiveRegMap`. This way no iterators get
invalidated and we can avoid passing around and updating iterators all
over the place.

This does not change any allocator decisions. It is not completely NFC
because the arbitrary iteration order through `LiveRegMap` in
`spillAll()` changes so we may get a different order in those spill
sequences (the amount of spills does not change).

This is in preparation of https://reviews.llvm.org/D52010.

llvm-svn: 346298
2018-11-07 06:57:03 +00:00
Matthias Braun b0ecbef428 RegAllocFast: Further cleanups; NFC
This is in preparation of https://reviews.llvm.org/D52010.

llvm-svn: 346297
2018-11-07 06:57:02 +00:00
Matthias Braun 0804dca358 RegAllocFast: Refactor PhysRegState usage; NFC
This is in preparation of https://reviews.llvm.org/D52010.

llvm-svn: 346296
2018-11-07 06:57:00 +00:00
Matthias Braun b4c76ff77c RegAllocFast: Factor spill/reload creation into their own functions; NFC
This is in preparation of https://reviews.llvm.org/D52010.

llvm-svn: 346289
2018-11-07 02:04:12 +00:00
Matthias Braun ebcf5437bc RegAllocFast: Cleanups; NFC
This is in preparation of https://reviews.llvm.org/D52010.

llvm-svn: 346288
2018-11-07 02:04:11 +00:00
Matthias Braun 14af82a608 RegAllocFast: Rename statistic from NumCopies to NumCoalesced
The metric does not return the number of remaining (or inserted) copies
but the number of copies that were coalesced. Pick a more descriptive
name.

llvm-svn: 346287
2018-11-07 02:04:07 +00:00
Jessica Paquette 935d373db9 [MachineOutliner][NFC] Remove OccurrenceCount from SuffixTreeNode
After changing the way we find candidates in r346269, this is no longer used.

llvm-svn: 346275
2018-11-06 22:23:13 +00:00
Jessica Paquette 979cf1e566 [MachineOutliner][NFC] Remove IsInTree from SuffixTreeNode
After changing the way we find repeated substrings in r346269, this
field is no longer used by anything, so it can be removed.

llvm-svn: 346274
2018-11-06 22:21:11 +00:00
Jessica Paquette 4e54ef8883 [MachineOutliner][NFC] Add findRepeatedSubstrings to SuffixTree, kill LeafVector
Instead of iterating over the leaves to find repeated substrings, and walking
collecting leaf children when we don't necessarily need them, let's just
calculate what we need and iterate over that.

By doing this, we don't have to save every leaf. It's easier to read the code
too and understand what's going on.

The goal here, at the end of the day, is to set up to allow us to do something
like

for (RepeatedSubstring &RS : ST) {
 ... do stuff with RS ...
}

Which would let us perform the cost model stuff and the repeated substring
query at the same time.

llvm-svn: 346269
2018-11-06 21:46:41 +00:00
Matthias Braun c6613879ce LivePhysRegs/IfConversion: Change some types from unsigned to MCPhysReg; NFC
Change the type in a couple of lists and sets that only store physical
registers from unsigned to MCPhysRegs. The later is only 16bits and
saves us a bit of memory.

llvm-svn: 346254
2018-11-06 19:00:11 +00:00
Matthias Braun 7a75a91b5b MachineFunction: Store more specific reference to LLVMTargetMachine; NFC
MachineFunction can only be used in code using lib/CodeGen, hence we
can keep a more specific reference to LLVMTargetMachine rather than just
TargetMachine around.

Do the same for references in ScheduleDAG and RegUsageInfoCollector.

llvm-svn: 346183
2018-11-05 23:49:14 +00:00