Commit Graph

23391 Commits

Author SHA1 Message Date
Bob Haarman a88bce1bce [NFC] clang-format llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
llvm-svn: 312035
2017-08-29 21:01:55 +00:00
Bob Haarman 223303c5a3 Reland r311957 [codeview] support more DW_OPs for more complete debug info
Summary:
Some variables show up in Visual Studio as "optimized out" even in -O0
-Od builds. This change fixes two issues that would cause this to
happen. The first issue is that not all DIExpressions we generate were
recognized by the CodeView writer. This has been addressed by adding
support for DW_OP_constu, DW_OP_minus, and DW_OP_plus. The second
issue is that we had no way to encode DW_OP_deref in CodeView. We get
around that by changinge the type we encode in the debug info to be
a reference to the type in the source code.

This fixes PR34261.

The reland adds two extra checks to the original: It checks if the
DbgVariableLocation is valid before checking any of its fields, and
it only emits ranges with nonzero registers.

Reviewers: aprantl, rnk, zturner

Reviewed By: rnk

Subscribers: mgorny, llvm-commits, aprantl, hiraditya

Differential Revision: https://reviews.llvm.org/D36907

llvm-svn: 312034
2017-08-29 20:59:25 +00:00
Hans Wennborg e7becd7e85 [DAG] Bound loop dependence check in merge optimization.
The loop dependence check looks for dependencies between store merge
candidates not captured by the chain sub-DAG doing a check of
predecessors which may be very large. Conservatively bound number of
nodes checked for compilation time. (Resolves PR34326).

Landing on behalf of Nirav Dave to unblock the 5.0.0 release.

Differential Revision: https://reviews.llvm.org/D37220

llvm-svn: 312022
2017-08-29 18:41:00 +00:00
Sanjay Patel 674d2c23ea [Instruction] add moveAfter() convenience function; NFCI
As suggested in D37121, here's a wrapper for removeFromParent() + insertAfter(),
but implemented using moveBefore() for symmetry/efficiency.

Differential Revision: https://reviews.llvm.org/D37239

llvm-svn: 312001
2017-08-29 14:07:48 +00:00
Bob Haarman 0bf0d66682 Revert "[codeview] support more DW_OPs for more complete debug info"
This reverts commit e160912f53f047bc97e572add179e08e33f4df48.

llvm-svn: 311977
2017-08-29 04:08:31 +00:00
Bob Haarman 858d098383 Revert "[codeview] don't try to emit variable locations without registers"
This reverts commit a256fbcacf448ee793d23552c46ed2971bf9eff5.

llvm-svn: 311976
2017-08-29 04:08:16 +00:00
Bob Haarman a8d0d1ab91 [codeview] don't try to emit variable locations without registers
This fixes a problem introduced 311957, where the compiler would crash
with "fatal error: error in backend: unknown codeview register".

llvm-svn: 311969
2017-08-29 01:45:54 +00:00
Bob Haarman b2a04a1513 [codeview] support more DW_OPs for more complete debug info
Summary:
Some variables show up in Visual Studio as "optimized out" even in -O0
-Od builds. This change fixes two issues that would cause this to
happen. The first issue is that not all DIExpressions we generate were
recognized by the CodeView writer. This has been addressed by adding
support for DW_OP_constu, DW_OP_minus, and DW_OP_plus. The second
issue is that we had no way to encode DW_OP_deref in CodeView. We get
around that by changinge the type we encode in the debug info to be
a reference to the type in the source code.

This fixes PR34261.

Reviewers: aprantl, rnk, zturner

Reviewed By: rnk

Subscribers: mgorny, llvm-commits, aprantl, hiraditya

Differential Revision: https://reviews.llvm.org/D36907

llvm-svn: 311957
2017-08-29 00:06:59 +00:00
Adrian Prantl 4cae108561 Fix a logic error in DwarfExpression::addMachineReg()
This fixes PR34323 and thus splitting undescribable registers into
smaller, describable sub-registers.

https://bugs.llvm.org/show_bug.cgi?id=34323

llvm-svn: 311951
2017-08-28 23:07:43 +00:00
Zachary Turner a7b041748d [CodeView] Don't output S_UDT symbols for forward decls.
S_UDT symbols are the debugger's "index" for all the structs,
typedefs, classes, and enums in a program.  If any of those
structs/classes don't have a complete declaration, or if there
is a typedef to something that doesn't have a complete definition,
then emitting the S_UDT is unhelpful because it doesn't give
the debugger enough information to do anything useful.  On the
other hand, it results in a huge size blow-up in the resulting
PDB, which is exacerbated by an order of magnitude when linking
with /DEBUG:FASTLINK.

With this patch, we drop S_UDT records for types that refer either
directly or indirectly (e.g. through a typedef, pointer, etc) to
a class/struct/union/enum without a complete definition.  This
brings us about 50% of the way towards parity with /DEBUG:FASTLINK
PDBs generated from cl-compiled object files.

Differential Revision: https://reviews.llvm.org/D37162

llvm-svn: 311904
2017-08-28 18:49:04 +00:00
Craig Topper 029a21dfdc [DAGCombiner] Teach visitEXTRACT_SUBVECTOR to turn extracts of BUILD_VECTOR into smaller BUILD_VECTORs
Only do this before operations are legalized of BUILD_VECTOR is Legal for the target.

Differential Revision: https://reviews.llvm.org/D37186

llvm-svn: 311892
2017-08-28 15:28:33 +00:00
NAKAMURA Takumi a1e97a77f5 Untabify.
llvm-svn: 311875
2017-08-28 06:47:47 +00:00
Sanjay Patel a7a61d9768 [DAGCombiner] allow undef shuffle operands when eliminating bitcasts (PR34111)
As noted in the FIXME, this could be improved more, but this is the smallest fix
that helps:
https://bugs.llvm.org/show_bug.cgi?id=34111

llvm-svn: 311853
2017-08-27 17:29:30 +00:00
Jatin Bhateja e4ca95d6aa [DAGCombiner] Extending pattern detection for vector shuffle.
Summary:
If all the operands of a BUILD_VECTOR extract elements from same vector then split the
vector efficiently based on the maximum vector access index.

This will also fix PR 33784

Reviewers: zvi, delena, RKSimon, thakis

Reviewed By: RKSimon

Subscribers: chandlerc, eladcohen, llvm-commits

Differential Revision: https://reviews.llvm.org/D35788

llvm-svn: 311833
2017-08-26 19:02:36 +00:00
Jatin Bhateja b60cfbefac Revert rL311247 : To rectify commit message.
Summary: This reverts commit rL311247.

Differential Revision: https://reviews.llvm.org/D36927

llvm-svn: 311832
2017-08-26 19:02:17 +00:00
Hiroshi Yamauchi 63e17ebf8b Add options to dump block frequency/branch probability info in text.
Summary:
Add options -print-bfi/-print-bpi that dump block frequency and branch
probability info like -view-block-freq-propagation-dags and
-view-machine-block-freq-propagation-dags do but in text.

This is useful when the graph is very large and complex (the dot command
crashes, lines/edges too close to tell apart, hard to navigate without textual
search) or simply when text is preferred.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37165

llvm-svn: 311822
2017-08-26 00:31:00 +00:00
David Blaikie 196f53b295 Fix unused-lambda-capture warning by using default capture-by-ref
Since the lambda isn't escaped (via a std::function or similar) it's
fine/better to use default capture-by-ref to provide semantics similar
to language-level nested scopes (if/for/while/etc).

llvm-svn: 311782
2017-08-25 16:46:07 +00:00
Matt Morehouse 355f6f7444 Fix buildbot breakage from r311763. Remove unused lambda capture.
llvm-svn: 311781
2017-08-25 16:19:26 +00:00
Aditya Nandakumar 892979effc [GISel]: Implement widenScalar for Legalizing G_PHI
https://reviews.llvm.org/D37018

llvm-svn: 311763
2017-08-25 04:57:27 +00:00
Sanjay Patel e404cbff66 [DAG] convert vector select-of-constants to logic/math
This goes back to a discussion about IR canonicalization. We'd like to preserve and convert
more IR to 'select' than we currently do because that's likely the best choice in IR:
http://lists.llvm.org/pipermail/llvm-dev/2016-September/105335.html
...but that's often not true for codegen, so we need to account for this pattern coming in
to the backend and transform it to better DAG ops.

Steps in this patch:

  1. Add an EVT param to the existing convertSelectOfConstantsToMath() TLI hook to more finely
     enable this transform. Other targets will probably want that anyway to distinguish scalars
     from vectors. We're using that here to exclude AVX512 targets, but it may not be necessary.

  2. Convert a vselect to ext+add. This eliminates a constant load/materialization, and the
     vector ext is often free.

Implementing a more general fold using xor+and can be a follow-up for targets that don't have
a legal vselect. It's also possible that we can remove the TLI hook for the special case fold
implemented here because we're eliminating a constant, but it needs to be tested on other
targets.

Differential Revision: https://reviews.llvm.org/D36840

llvm-svn: 311731
2017-08-24 23:24:43 +00:00
Eugene Zelenko 5df3d89009 [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 311703
2017-08-24 21:21:39 +00:00
Victor Leschuk 6aedf785c5 Remove duplicate code
llvm-svn: 311675
2017-08-24 17:02:38 +00:00
Victor Leschuk 471579b52e Add missing break in switch
llvm-svn: 311673
2017-08-24 16:57:10 +00:00
Matt Arsenault d664315ae8 IPRA: Don't assume called function is first call operand
Fixes not finding the called global for AMDGPU
call pseudoinstructions, which prevented IPRA
from doing much.

llvm-svn: 311637
2017-08-24 07:55:15 +00:00
Matt Arsenault 00459e4a06 IPRA: Exit early on functions without calls
llvm-svn: 311636
2017-08-24 07:55:13 +00:00
Wei Ding a131d3fb29 Add ‘llvm.experimental.constrained.fma‘ Intrinsic.
Differential Revision: http://reviews.llvm.org/D36335

llvm-svn: 311629
2017-08-24 04:18:24 +00:00
Hans Wennborg c39ec95d88 [DAG] Fix Node Replacement in PromoteIntBinOp
When one operand is a user of another in a promoted binary operation
we may replace and delete the returned value before returning
triggering an assertion. Reorder node replacements to prevent this.

Fixes PR34137.

Landing on behalf of Nirav.

Differential Revision: https://reviews.llvm.org/D36581

llvm-svn: 311623
2017-08-24 01:08:27 +00:00
Adrian Prantl 7db6b5e2b3 Retire the llvm.dbg.mir hack after r311594.
llvm-svn: 311610
2017-08-23 22:02:36 +00:00
Aditya Nandakumar efd8a84cd5 [GISEl]: Translate phi into G_PHI
G_PHI has the same semantics as PHI but also has types.
This lets us verify that the types in the G_PHI are consistent.
This also allows specifying legalization actions for G_PHIs.

https://reviews.llvm.org/D36990

llvm-svn: 311596
2017-08-23 20:45:48 +00:00
Reid Kleckner 950567aac4 Attempt to fix the BUILD_SHARED_LIBS build after the DIExpression change
llvm-svn: 311595
2017-08-23 20:39:35 +00:00
Reid Kleckner 6d353348e5 Parse and print DIExpressions inline to ease IR and MIR testing
Summary:
Most DIExpressions are empty or very simple. When they are complex, they
tend to be unique, so checking them inline is reasonable.

This also avoids the need for CodeGen passes to append to the
llvm.dbg.mir named md node.

See also PR22780, for making DIExpression not be an MDNode.

Reviewers: aprantl, dexonsmith, dblaikie

Subscribers: qcolombet, javed.absar, eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37075

llvm-svn: 311594
2017-08-23 20:31:27 +00:00
Lei Huang 0cb591fc4c Update branch coalescing to be a PowerPC specific pass
Implementing this pass as a PowerPC specific pass.  Branch coalescing utilizes
the analyzeBranch method which currently does not include any implicit operands.
This is not an issue on PPC but must be handled on other targets.

Differential Revision : https: // reviews.llvm.org/D32776

llvm-svn: 311588
2017-08-23 19:25:04 +00:00
Dean Michael Berris 0884b73220 [XRay][CodeGen] Use PIC-friendly code in XRay sleds; remove synthetic references in .text
Summary:
This change achieves two things:

  - Redefine the Custom Event handling instrumentation points emitted by
    the compiler to not require dynamic relocation of references to the
    __xray_CustomEvent trampoline.

  - Remove the synthetic reference we emit at the end of a function that
    we used to keep auxiliary sections alive in favour of SHF_LINK_ORDER
    associated with the section where the function is defined.

To achieve the custom event handling change, we've had to introduce the
concept of sled versioning -- this will need to be supported by the
runtime to allow us to understand how to turn on/off the new version of
the custom event handling sleds. That change has to land first before we
change the way we write the sleds.

To remove the synthetic reference, we rely on a relatively new linker
feature that preserves the sections that are associated with each other.
This allows us to limit the effects on the .text section of ELF
binaries.

Because we're still using absolute references that are resolved at
runtime for the instrumentation map (and function index) maps, we mark
these sections write-able. In the future we can re-define the entries in
the map to use relative relocations instead that can be statically
determined by the linker. That change will be a bit more invasive so we
defer this for later.

Depends on D36816.

Reviewers: dblaikie, echristo, pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36615

llvm-svn: 311525
2017-08-23 04:49:41 +00:00
Matthias Braun 8426d1342d Add test case for r311511
This also changes the TailDuplicator to be configured explicitely
pre/post regalloc rather than relying on the isSSA() flag. This was
necessary to have `llc -run-pass` work reliably.

llvm-svn: 311520
2017-08-23 03:17:59 +00:00
Matthias Braun 55bc9b3f9e TargetInstrInfo: Change duplicate() to work on bundles.
Adds infrastructure to clone whole instruction bundles rather than just
single instructions. This fixes a bug where tail duplication would
unbundle instructions while cloning.

This should unbreak the "Clang Stage 1: cmake, RA, with expensive checks
enabled" build on greendragon. The bot broke with r311139 hitting this
pre-existing bug.

A proper testcase will come next.

llvm-svn: 311511
2017-08-22 23:56:30 +00:00
Craig Topper 35189d5221 [SelectionDAG] Make ISD::isConstantSplatVector always return an element sized APInt.
This partially reverts r311429 in favor of making ISD::isConstantSplatVector do something not confusing. Turns out the only other user of it was also having to deal with the weird property of it returning a smaller size.

So rather than continue to deal with this quirk everywhere, just make the interface do something sane.

Differential Revision: https://reviews.llvm.org/D37039

llvm-svn: 311510
2017-08-22 23:54:13 +00:00
Jonas Devlieghere a680a8f5f8 [Debug info] Add new DbgValues after looping over DAG
I was contacted by Jesper Antonsson from Ericsson who ran into problems
with r311181 in their test suites with for an out-of-tree target.
Because of the latter I don't have a reproducer, but we definitely don't
want to modify the data structure on which we are iterating inside the
loop.

llvm-svn: 311466
2017-08-22 16:28:07 +00:00
Renato Golin c070c73d5e [ARM] Avoid creating duplicate ANDs in SelectionDAG
When expanding a BRCOND into a BR_CC, do not create an AND 1
if one already exists.

Review: D36705

Patch by Joel Galenson <jgalenson@google.com>

llvm-svn: 311447
2017-08-22 11:02:45 +00:00
Sjoerd Meijer e0c933f5d6 [SelectionDAG] Add getNode debug messages
This adds debug messages to various functions that create new SDValue nodes.
This is e.g. useful to have during legalization, as otherwise it can prints
legalization info of nodes that did not appear in the dumps before.

Differential Revision: https://reviews.llvm.org/D36984

llvm-svn: 311444
2017-08-22 10:43:51 +00:00
Craig Topper b49f0893b2 [X86] Prevent several calls to ISD::isConstantSplatVector from returning a narrower APInt than the original scalar type
ISD::isConstantSplatVector can shrink to the smallest splat width. But we don't check the size of the resulting APInt at all. This can cause us to misinterpret the results.

This patch just adds a flag to prevent the APInt from changing width.

Fixes PR34271.

Differential Revision: https://reviews.llvm.org/D36996

llvm-svn: 311429
2017-08-22 05:40:17 +00:00
Quentin Colombet 4056e80719 [RegAlloc] Make sure live-ranges reflect the state of the IR when removing them
When removing a live-range we used to not touch them making debug
prints harder to read because the IR was not matching what the
live-ranges information was saying.

This only affects debug printing and allows to put stronger asserts in
the code (see r308906 for instance).

llvm-svn: 311401
2017-08-21 22:56:18 +00:00
Benjamin Kramer 49a49fe816 Move helper classes into anonymous namespaces.
No functionality change intended.

llvm-svn: 311288
2017-08-20 13:03:48 +00:00
Jatin Bhateja 6b4c205685 [DAGCombiner] Extending pattern detection for vector shuffle.
Summary:
    If all the operands of a BUILD_VECTOR extract elements from same vector then split the
    vector efficiently based on the maximum vector access index.

    Reviewers: zvi, delena, RKSimon, thakis

    Reviewed By: RKSimon

    Subscribers: chandlerc, eladcohen, llvm-commits

    Differential Revision: https://reviews.llvm.org/D35788

llvm-svn: 311255
2017-08-19 18:08:59 +00:00
Jatin Bhateja 66f7958e91 Revert rL311247 : To rectify commit message.
Summary: This reverts commit rL311247.

Differential Revision: https://reviews.llvm.org/D36927

llvm-svn: 311252
2017-08-19 17:59:58 +00:00
Jatin Bhateja 6f0d0d23b0 Merge branch 'arcpatch-D35788'
llvm-svn: 311247
2017-08-19 17:00:04 +00:00
Jatin Bhateja 1c56863739 Revert rL311242 "Extension of shuffle vector pattern detection, updating post rebase."
Summary:

This reverts commit rL311242.

Differential Revision: https://reviews.llvm.org/D36924

llvm-svn: 311246
2017-08-19 16:40:06 +00:00
Jatin Bhateja 313f97dd84 Extension of shuffle vector pattern detection, updating post rebase.
llvm-svn: 311242
2017-08-19 15:58:36 +00:00
Adrian Prantl 2116dd360a Filter out non-constant DIGlobalVariableExpressions reachable via the CU
They won't affect the DWARF output, but they will mess with the
sorting of the fragments. This fixes the crash reported in PR34159.

https://bugs.llvm.org/show_bug.cgi?id=34159

llvm-svn: 311217
2017-08-19 01:15:06 +00:00
Jonas Devlieghere e101b07a1d [Debug info] Transfer DI to fragment expressions for split integer values.
This patch teaches the SDag type legalizer how to split up debug info for
integer values that are split into a hi and lo part.

(re-commit)

Differential Revision: https://reviews.llvm.org/D36805

llvm-svn: 311181
2017-08-18 18:07:00 +00:00
Craig Topper e3edd9c9be [DAGCombiner] Fix bad comment that had immediate values swapped from the code and what they need to be to make sense. NFC
llvm-svn: 311144
2017-08-18 04:52:46 +00:00
Geoff Berry bd47e8a4f7 Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" round 2
This reverts commit r311135.

sanitizer-x86_64-linux-android buildbot is timing out with just this
patch applied.

llvm-svn: 311142
2017-08-18 01:43:11 +00:00
Richard Smith c0541dfa3e Increase tail dup threshold for -O3 from 3 to 4.
We see a modest performance improvement from this slightly higher tail dup threshold.

Differential Revision: https://reviews.llvm.org/D36775

llvm-svn: 311139
2017-08-17 23:38:41 +00:00
Geoff Berry 51f52c4fca Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Two issues identified by buildbots were addressed:
    - The pass no longer forwards COPYs to physical register uses, since
      doing so can break code that implicitly relies on the physical
      register number of the use.
    - The pass no longer forwards COPYs to undef uses, since doing so
      can break the machine verifier by creating LiveRanges that don't
      end on a use (since the undef operand is not considered a use).

    [MachineCopyPropagation] Extend pass to do COPY source forwarding

    This change extends MachineCopyPropagation to do COPY source forwarding.

    This change also extends the MachineCopyPropagation pass to be able to
    be run during register allocation, after physical registers have been
    assigned, but before the virtual registers have been re-written, which
    allows it to remove virtual register COPY LiveIntervals that become dead
    through the forwarding of all of their uses.

    Reviewers: qcolombet, javed.absar, MatzeB, jonpa

    Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny

    Differential Revision: https://reviews.llvm.org/D30751

llvm-svn: 311135
2017-08-17 23:06:55 +00:00
Eugene Zelenko 6e07bfd0d9 [CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 311124
2017-08-17 21:26:39 +00:00
Jonas Devlieghere 30756da212 Revert "[Debug info] Transfer DI to fragment expressions for split integer values."
This reverts commit r311102.

llvm-svn: 311111
2017-08-17 17:58:33 +00:00
Jonas Devlieghere 622fedc001 [Debug info] Transfer DI to fragment expressions for split integer values.
This patch teaches the SDag type legalizer how to split up debug info for
integer values that are split into a hi and lo part.

Differential Revision: https://reviews.llvm.org/D36805

llvm-svn: 311102
2017-08-17 17:06:48 +00:00
Adrian Prantl 6a57daad81 Improve line debug info when translating a CaseBlock to SDNodes.
The SelectionDAGBuilder translates various conditional branches into
CaseBlocks which are then translated into SDNodes. If a conditional
branch results in multiple CaseBlocks only the first CaseBlock is
translated into SDNodes immediately, the rest of the CaseBlocks are
put in a queue and processed when all LLVM IR instructions in the
basic block have been processed.

When a CaseBlock is transformed into SDNodes the SelectionDAGBuilder
is queried for the current LLVM IR instruction and the resulting
SDNodes are annotated with the debug info of the current
instruction (if it exists and has debug metadata).

When the deferred CaseBlocks are processed, the SelectionDAGBuilder
does not have a current LLVM IR instruction, and the resulting SDNodes
will not have any debuginfo. As DwarfDebug::beginInstruction() outputs
a .loc directive for the first instruction in a labeled
block (typically the case for something coming from a CaseBlock) this
tends to produce a line-0 directive.

This patch changes the handling of CaseBlocks to store the current
instruction's debug info into the CaseBlock when it is created (and the
SelectionDAGBuilder knows the current instruction) and to always use
the stored debug info when translating a CaseBlock to SDNodes.

Patch by Frej Drejhammar!

Differential Revision: https://reviews.llvm.org/D36671

llvm-svn: 311097
2017-08-17 16:57:13 +00:00
Simon Pilgrim 8be9f4af4f [DAGCombiner] Add support for non-uniform constant vectors to (mul x, (1 << c)) -> x << c
llvm-svn: 311083
2017-08-17 13:03:34 +00:00
Jonas Paulsson 57a705d9d0 [SystemZ, MachineScheduler] Improve post-RA scheduling.
The idea of this patch is to continue the scheduler state over an MBB boundary
in the case where the successor block has only one predecessor. This means
that the scheduler will continue in the successor block (after emitting any
branch instructions) with e.g. maintained processor resource counters.
Benchmarks have been confirmed to benefit from this.

The algorithm in MachineScheduler.cpp that extracts scheduling regions of an
MBB has been extended so that the strategy may optionally reverse the order
of processing the regions themselves. This is controlled by a new method
doMBBSchedRegionsTopDown(), which defaults to false.

Handling the top-most region of an MBB first also means that a top-down
scheduler can continue the scheduler state across any scheduling boundary
between to regions inside MBB.

Review: Ulrich Weigand, Matthias Braun, Andy Trick.
https://reviews.llvm.org/D35053

llvm-svn: 311072
2017-08-17 08:33:44 +00:00
Elad Cohen 124d32829c [SelectionDAG] Teach the vector-types operand scalarizer about SETCC
When v1i1 is legal (e.g. AVX512) the legalizer can reach
a case where a v1i1 SETCC with an illgeal vector type operand
wasn't scalarized (since v1i1 is legal) but its operands does
have to be scalarized. This used to assert because SETCC was
missing from the vector operand scalarizer.

This patch attemps to teach the legalizer to handle these cases
by scalazring the operands, converting the node into a scalar
SETCC node.

Differential revision: https://reviews.llvm.org/D36651

llvm-svn: 311071
2017-08-17 08:06:36 +00:00
Serguei Katkov 9e5604dbe1 [CGP] Fix the rematerialization of gc.relocates
If we want to substitute the relocation of derived pointer with gep of base then
we must ensure that relocation of base dominates the relocation of derived pointer.

Currently only check for basic block is present. However it is possible that both
relocation are in the same basic block but relocation of derived pointer is defined
earlier.

The patch moves the relocation of base pointer right before relocation of derived
pointer in this case.

Reviewers: sanjoy,artagnon,igor-laevsky,reames
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36462

llvm-svn: 311067
2017-08-17 05:48:30 +00:00
Geoff Berry 4e38e02e6f Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
This reverts commit r311038.

Several buildbots are breaking, and at least one appears to be due to
the forwarding of physical regs enabled by this change.  Reverting while
I investigate further.

llvm-svn: 311062
2017-08-17 04:04:11 +00:00
Geoff Berry 87f8d25150 [MachineCopyPropagation] Extend pass to do COPY source forwarding
This change extends MachineCopyPropagation to do COPY source forwarding.

This change also extends the MachineCopyPropagation pass to be able to
be run during register allocation, after physical registers have been
assigned, but before the virtual registers have been re-written, which
allows it to remove virtual register COPY LiveIntervals that become dead
through the forwarding of all of their uses.

Reviewers: qcolombet, javed.absar, MatzeB, jonpa

Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny

Differential Revision: https://reviews.llvm.org/D30751

llvm-svn: 311038
2017-08-16 20:50:01 +00:00
Balaram Makam c5698befb6 Revert "MachineInstr: Reason locally about some memory objects before going to AA."
r310825 caused the clang-ppc64le-linux-lnt bot to go red
(http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/5712)
because of a test-suite failure of
SingleSource/UnitTests/2003-07-09-SignedArgs

This reverts commit 0028f6a87224fb595a1c19c544cde9b003035996.

llvm-svn: 311008
2017-08-16 14:17:43 +00:00
Quentin Colombet 647b482c1e [VirtRegRewriter] Properly model the register liveness on undef subreg definition
Undef subreg definition means that the content of the super register
doesn't matter at this point. While that's true for virtual registers,
this may not hold when replacing them with actual physical registers.
Indeed, some part of the physical register may be coalesced with the
related virtual register and thus, the values for those parts matter and
must be live.

The fix consists in checking whether or not subregs of the physical register
being assigned to an undef subreg definition are live through that def and
insert an implicit use if they are. Doing so, will keep them alive until
that point like they should be.

E.g., let vreg14 being assigned to R0_R1 then
%vreg14:gsub_0<def,read-undef> = COPY %R0 ; <-- R1 is still live here
%vreg14:gsub_1<def> = COPY %R1

Before this changes, the rewriter would change the code into:
%R0<def> = KILL %R0, %R0_R1<imp-def> ; <-- this tells R1 is redefined
%R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use> ; this value of this R1
                                                      ; is believed to come
                                                      ; from the previous
                                                      ; instruction

Because of this invalid liveness, later pass could make wrong choices and in
particular clobber live register as it happened with the register scavenger in
llvm.org/PR34107

Now we would generate:
%R0<def> = KILL %R0, %R0_R1<imp-def>, %R0_R1<imp-use> ; This tells R1 needs to
                                                      ; reach this point
%R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use>

The bug has been here forever, it got exposed recently because the register
scavenger got smarter.

Fixes llvm.org/PR34107

llvm-svn: 310979
2017-08-16 00:17:05 +00:00
Jessica Paquette 95c1107f4c [MachineOutliner] Only outline candidates of length >= 2
Since we don't factor in instruction lengths into outlining calculations
right now, it's never the case that a candidate could have length < 2.

Thus, we should quit early when we see such candidates.

llvm-svn: 310894
2017-08-14 22:57:41 +00:00
Andrew Kaylor 53a5fbb45f Add strictfp attribute to prevent unwanted optimizations of libm calls
Differential Revision: https://reviews.llvm.org/D34163

llvm-svn: 310885
2017-08-14 21:15:13 +00:00
Matt Arsenault 81da0d45f8 IPRA: Allow target to enable IPRA by default
llvm-svn: 310876
2017-08-14 19:54:47 +00:00
Matt Arsenault f9273c81d6 IPRA: Run RegUsageInfoPropagate much later
This was running immediately after isel, before
isel pseudos were even expanded which is really
unreasonable. Move this to before pre-reglloc
passes in case some other pre-regalloc pass wants to
use the updated regmask info.

Fixes one of the reasons IPRA doesn't do anything on
AMDGPU currently. Tests will be included with future
patch after a few more are fixed.

llvm-svn: 310875
2017-08-14 19:54:45 +00:00
Amaury Sechet 9c529b6be3 [DAGCombine] Do not try to deduplicate commutative operations if both operand are the same.
Summary: It is creating useless work as the commuted nodes is the same as the node we are working on in that case.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33840

llvm-svn: 310832
2017-08-14 11:44:03 +00:00
Elad Cohen 6a9edda356 [SelectionDAG] combine vextract (v1iX extract_subvector(vNiX, Idx))
into vextract(vNiX,Idx) when creating vextract with getNode().
This case appeared in AVX512 after fixing pr33349 in r310552.

Differential revision: https://reviews.llvm.org/D36571

llvm-svn: 310828
2017-08-14 10:49:45 +00:00
Balaram Makam d9f53414de MachineInstr: Reason locally about some memory objects before going to AA.
This addresses a FIXME in MachineInstr::mayAlias.

llvm-svn: 310825
2017-08-14 09:41:40 +00:00
Elad Cohen 3a90a0c10d Revert "[DAGCombiner] Extending pattern detection for vector shuffle (REAPPLIED)"
This reverts commit r310782.

llvm-svn: 310822
2017-08-14 09:06:00 +00:00
Craig Topper 2251ef95a3 [X86][ARM][TargetLowering] Add SrcVT to isExtractSubvectorCheap
Summary:
Without the SrcVT its hard to know what is really being asked for. For example if your target has 128, 256, and 512 bit vectors. Maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not.

For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors.

Reviewers: RKSimon, zvi, efriedma

Reviewed By: RKSimon

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D36649

llvm-svn: 310793
2017-08-13 17:29:07 +00:00
Simon Pilgrim 5a86f0e717 [DAGCombiner] Extending pattern detection for vector shuffle (REAPPLIED)
If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index.

Reapplied with fix to only work with simple value types.

Committed on behalf of @jbhateja (Jatin Bhateja)

Differential Revision: https://reviews.llvm.org/D35788

llvm-svn: 310782
2017-08-12 17:43:25 +00:00
Sanjay Patel 169dae70a6 [x86] use more shift or LEA for select-of-constants (2nd try)
The previous rev (r310208) failed to account for overflow when subtracting the
constants to see if they're suitable for shift/lea. This version add a check
for that and more test were added in r310490.

We can convert any select-of-constants to math ops:
http://rise4fun.com/Alive/d7d

For this patch, I'm enhancing an existing x86 transform that uses fake multiplies
(they always become shl/lea) to avoid cmov or branching. The current code misses
cases where we have a negative constant and a positive constant, so this is just
trying to plug that hole.

The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start
with a select in IR, create a select DAG node, convert it into a sext, convert it
back into a select, and then lower it to sext machine code.

Some notes about the test diffs:

1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR.
2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. We could avoid the 
   push/pop in some cases if we used 'movzbl %al' instead of an xor on a different reg? That's a 
   post-DAG problem though.
3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if
   that's a regression, but those would always be nearly equivalent.
4. pr22338.ll and sext-i1.ll - These tests have undef operands, so we don't actually care about these diffs.
5. sbb.ll - This shows a win for what is likely a common case: choose -1 or 0.
6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again.
7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops.

Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass.

Differential Revision: https://reviews.llvm.org/D35340

llvm-svn: 310717
2017-08-11 15:44:14 +00:00
Nirav Dave 0a48e5d506 Improve handling of insert_subvector of bitcast values
Fix insert_subvector / extract_subvector merges of bitcast values.

Reviewers: efriedma, craig.topper, RKSimon

Subscribers: RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D34571

llvm-svn: 310711
2017-08-11 13:21:41 +00:00
Nirav Dave d1b3f09faa [X86][DAG] Switch X86 Target to post-legalized store merge
Move store merge to happen after intrinsic lowering to allow lowered
stores to be merged.

Some regressions due in MergeConsecutiveStores to missing
insert_subvector that are addressed in follow up patch.

Reviewers: craig.topper, efriedma, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34559

llvm-svn: 310710
2017-08-11 13:21:35 +00:00
Simon Pilgrim 83cf3a29b5 [DAGCombiner] Remove shuffle support from simplifyShuffleMask
rL310372 enabled simplifyShuffleMask to support undef shuffle mask inputs, but its causing hangs.

Removing support until I can triage the problem

llvm-svn: 310699
2017-08-11 08:37:00 +00:00
Mikael Holmen 8b10680922 [IfConversion] Maintain the CFG when predicating/merging blocks in IfConvert*
Summary:
This fixes PR32721 in IfConvertTriangle and possible similar problems in
IfConvertSimple, IfConvertDiamond and IfConvertForkedDiamond.

In PR32721 we had a triangle

   EBB
   | \
   |  |
   | TBB
   |  /
   FBB

where FBB didn't have any successors at all since it ended with an
unconditional return. Then TBB and FBB were be merged into EBB, but EBB
would still keep its successors, and the use of analyzeBranch and
CorrectExtraCFGEdges wouldn't help to remove them since the return
instruction is not analyzable (at least not on ARM).

The edge updating code and branch probability updating code is now pushed
into MergeBlocks() which allows us to share the same update logic between
more callsites. This lets us remove several dependencies on analyzeBranch
and completely eliminate RemoveExtraEdges.

One thing that showed up with this patch was that IfConversion sometimes
left a successor with 0% probability even if there was no branch or
fallthrough to the successor.

One such example from the test case ifcvt_bad_zero_prob_succ.mir. The
indirect branch tBRIND can only jump to bb.1, but without the patch we
got:

  bb.0:
    successors: %bb.1(0x80000000)

  bb.1:
    successors: %bb.1(0x80000000), %bb.2(0x00000000)
    tBRIND %r1, 1, %cpsr
    B %bb.1

  bb.2:

There is no way to jump from bb.1 to bb2, but still there is a 0% edge
from bb.1 to bb.2.

With the patch applied we instead get the expected:

  bb.0:
    successors: %bb.1(0x80000000)

  bb.1:
    successors: %bb.1(0x80000000)
    tBRIND %r1, 1, %cpsr
    B %bb.1

Since bb.2 had no predecessor at all, it was removed.

Several testcases had to be updated due to this since the removed
successor made the "Branch Probability Basic Block Placement" pass
sometimes place blocks in a different order.

Finally added a couple of new test cases:

* PR32721_ifcvt_triangle_unanalyzable.mir:
  Regression test for the original problem dexcribed in PR 32721.

* ifcvt_triangleWoCvtToNextEdge.mir:
  Regression test for problem that caused a revert of my first attempt
  to solve PR 32721.

* ifcvt_simple_bad_zero_prob_succ.mir:
  Test case showing the problem where a wrong successor with 0% probability
  was previously left.

* ifcvt_[diamond|forked_diamond|simple]_unanalyzable.mir
  Very simple test cases for the simple and (forked) diamond cases
  involving unanalyzable branches that can be nice to have as a base if
  wanting to write more complicated tests.

Reviewers: iteratee, MatzeB, grosser, kparzysz

Reviewed By: kparzysz

Subscribers: kbarton, davide, aemerson, nemanjai, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34099

llvm-svn: 310697
2017-08-11 06:57:08 +00:00
Nirav Dave f8556ad48f Revert "[DAG] Cleanup unused nodes after store merge. NFCI."
This reverts commit r310648 which causes an unexpected assertion failure

llvm-svn: 310659
2017-08-10 21:03:36 +00:00
Nirav Dave 4d28c0ff4f [DAG] Relax type restriction for store merge
Summary: Allow stores of bitcastable types to be merged by peeking through BITCAST nodes and recasting stored values constant and vector extract nodes as necessary.

Reviewers: jyknight, hfinkel, efriedma, RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34569

llvm-svn: 310655
2017-08-10 19:52:45 +00:00
Nirav Dave 99d9d24553 [DAG] Cleanup unused nodes after store merge. NFCI.
llvm-svn: 310648
2017-08-10 18:53:14 +00:00
Taewook Oh f5040b9685 Make .file directive to have basename only
Summary:
Currently LLVM puts directory along with the filename in .file directive, but this behavior doesn't match gcc. There's a no clear description about which one is right (https://sourceware.org/binutils/docs/as/File.html#File), but one document (https://sourceware.org/gdb/current/onlinedocs/stabs/ELF-Linker-Relocation.html) suggests that STT_FILE symbol in elf file is expected to have basename only, which should have a same sting file .file directive according to (https://docs.oracle.com/cd/E26502_01/html/E28388/eoiyg.html).

This also affects badly on the build system that uses hashing, as the directory info could be differnt from developer to developer even when they're working on same file.

Reviewers: pcc, mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36018

llvm-svn: 310642
2017-08-10 18:17:11 +00:00
Krzysztof Parzyszek bea30c6286 Add "Restored" flag to CalleeSavedInfo
The liveness-tracking code assumes that the registers that were saved
in the function's prolog are live outside of the function. Specifically,
that registers that were saved are also live-on-exit from the function.
This isn't always the case as illustrated by the LR register on ARM.

Differential Revision: https://reviews.llvm.org/D36160

llvm-svn: 310619
2017-08-10 16:17:32 +00:00
Nirav Dave 06242a99ce [DAG] Rewrite expression. NFC.
llvm-svn: 310608
2017-08-10 15:29:33 +00:00
Nirav Dave 926e2d39bf [X86] Keep dependencies when constructing loads in combineStore
Summary:
Preserve chain dependecies between old and new loads constructed to
prevent loads from reordering below later stores.

Fixes PR34088.

Reviewers: craig.topper, spatel, RKSimon, efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36528

llvm-svn: 310604
2017-08-10 15:12:32 +00:00
Guy Blank 136b543745 [SelectionDAG] Allow constant folding for implicitly truncating BUILD_VECTOR nodes.
In FoldConstantArithmetic, handle BUILD_VECTOR nodes that do implicit truncation on the elements.

This is similar to what is done in FoldConstantVectorArithmetic.

Differential Revision:
https://reviews.llvm.org/D36506

llvm-svn: 310593
2017-08-10 14:09:50 +00:00
Elad Cohen 22ba97a0a6 [SelectionDAG] When scalarizing vselect, don't assert on
a legal cond operand.

When scalarizing the result of a vselect, the legalizer currently expects
to already have scalarized the operands. While this is true for the true/false
operands (which have the same type as the result), it is not case for the
condition operand. On X86 AVX512, v1i1 is legal - this leads to operations such
as '< N x type> vselect < N x i1> < N x type> < N x type>' where < N x type > is
illegal to hit an assertion during the scalarization.

The handling is similar to r205625.
This also exposes the fact that (v1i1 extract_subvector) should be legal
and selectable on AVX512 - We do this by custom lowering to vector_extract_elt.
This still leaves us in some cases with redundant dag nodes which will be
combined in a separate soon to come patch.

This fixes pr33349.

Differential revision: https://reviews.llvm.org/D36511

llvm-svn: 310552
2017-08-10 07:44:23 +00:00
David Blaikie 76fb649b0d Reduce variable scope by moving declaration into if clause
llvm-svn: 310506
2017-08-09 18:34:18 +00:00
Nirav Dave 6110d3ad00 [DAG] Explicitly cleanup merged load values during store merge. NFCI.
llvm-svn: 310474
2017-08-09 13:37:07 +00:00
Serguei Katkov 6ea2e81cf6 [ImplicitNullCheck] Fix the bug when dependent instruction accesses memory
It is possible that dependent instruction may access memory.
In this case we must reject optimization because the memory change will
be visible in null handler basic block. So we will execute an instruction which
we must not execute if check fails.

Reviewers: sanjoy, reames
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36392

llvm-svn: 310443
2017-08-09 05:17:02 +00:00
Reid Kleckner e2e82061f9 [codeview] Emit nested enums and typedefs from classes
Previously we limited ourselves to only emitting nested classes, but we
need other kinds of types as well.

This fixes the Visual Studio STL visualizers, so that users can
visualize std::string and other objects.

llvm-svn: 310410
2017-08-08 20:30:14 +00:00
Nirav Dave 8a813cf646 [DAG] Introduce peekThroughBitcast function. NFCI.
llvm-svn: 310405
2017-08-08 20:01:18 +00:00
Nirav Dave 515116d7c2 [DAG] Update comments. NFC.
llvm-svn: 310404
2017-08-08 19:52:19 +00:00
Simon Pilgrim 91b7b991d4 [DAGCombiner] simplifyShuffleMask - handle UNDEF inputs from shuffles as well as BUILD_VECTOR
Minor extension to D36393

llvm-svn: 310372
2017-08-08 16:10:33 +00:00
Simon Pilgrim ef44228acb [DAGCombiner] Simplify shuffle mask index if the referenced input element is UNDEF
Fixes one of the cases in PR34041.

Differential Revision: https://reviews.llvm.org/D36393

llvm-svn: 310344
2017-08-08 11:03:30 +00:00
Sanjay Patel 807f92b8ff [x86] revert r310208 to investigate test-suite failures (PR34105 / PR34097)
llvm-svn: 310264
2017-08-07 15:47:48 +00:00
Nirav Dave 3d3bde7682 [DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector.
Relanding after case to insert explicit truncation as necessary.

Allow SCALAR_TO_VECTOR of EXTRACT_VECTOR_ELT to reduce to
EXTRACT_SUBVECTOR of vector shuffle when output is smaller. Marginally
improves vector shuffle computations.

Reviewers: efriedma, RKSimon, spatel

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35566

llvm-svn: 310256
2017-08-07 14:07:49 +00:00
Guy Blank 5ca01695f7 [SelectionDAG] reset NewNodesMustHaveLegalTypes flag between basic blocks
The NewNodesMustHaveLegalTypes flag is set to false at the beginning of CodeGenAndEmitDAG, and set to true after legalizing types.
But before calling CodeGenAndEmitDAG we build the DAG for the basic block.
So for the first basic block NewNodesMustHaveLegalTypes would be 'false' during the SDAG building, and for all other basic blocks it would be 'true'.

This patch sets the flag to false before SDAG building each basic block.

Differential Revision:
https://reviews.llvm.org/D33435

llvm-svn: 310239
2017-08-07 05:51:14 +00:00
Sanjay Patel a923c2ee95 [x86] use more shift or LEA for select-of-constants
We can convert any select-of-constants to math ops:
http://rise4fun.com/Alive/d7d

For this patch, I'm enhancing an existing x86 transform that uses fake multiplies 
(they always become shl/lea) to avoid cmov or branching. The current code misses 
cases where we have a negative constant and a positive constant, so this is just 
trying to plug that hole.

The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start 
with a select in IR, create a select DAG node, convert it into a sext, convert it 
back into a select, and then lower it to sext machine code.

Some notes about the test diffs:

1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR.
2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. I 
   think we could avoid the push/pop in some cases if we used 'movzbl %al' instead of an xor on 
   a different reg? That's a post-DAG problem though.
3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if 
   that's a regression, but I think those would always be nearly equivalent.
4. pr22338.ll and sext-i1.ll - These tests have undef operands, so I don't think we actually care about these diffs.
5. sbb.ll - This shows a win for what I think is a common case: choose -1 or 0.
6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again.
7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops.

Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass.

Differential Revision: https://reviews.llvm.org/D35340

llvm-svn: 310208
2017-08-06 16:27:07 +00:00
Matt Arsenault b94972cb82 IPRA: Don't crash on null getCallPreservedMask
Kernels aren't callable, so they don't have a call preserved mask.

llvm-svn: 310172
2017-08-05 07:50:18 +00:00
Kyle Butt 74f61dd8ef BlockPlacement: add a flag to force cold block outlining w/o a profile.
NFC.

llvm-svn: 310129
2017-08-04 21:13:41 +00:00
Nico Weber b24df62bb6 Revert r310058, it caused PR34073.
llvm-svn: 310118
2017-08-04 20:24:13 +00:00
Quentin Colombet c7652e22d7 [GlobalISel] Remove a stall comment in CMake.
Thanks to Diana Picus <diana.picus@linaro.org> for noticing.

NFC

llvm-svn: 310114
2017-08-04 20:15:41 +00:00
Marcello Maggioni 8de4bbdaa5 [MachineOperand] Add ChangeToTargetIndex method. NFC
Differential Revision: https://reviews.llvm.org/D36301

llvm-svn: 310083
2017-08-04 18:24:09 +00:00
Simon Pilgrim 5c63586489 [DAGCombiner] Extending pattern detection for vector shuffle.
If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index.

Committed on behalf of @jbhateja (Jatin Bhateja)

Differential Revision: https://reviews.llvm.org/D35788

llvm-svn: 310058
2017-08-04 12:46:35 +00:00
Eric Christopher 52854dcd34 Fix typo.
llvm-svn: 309997
2017-08-03 22:41:12 +00:00
Matt Arsenault a52391f2db DAG: Provide access to Pass instance from SelectionDAG
This allows accessing an analysis pass during lowering.

llvm-svn: 309991
2017-08-03 21:54:00 +00:00
Quentin Colombet 250e050a50 [GlobalISel] Make GlobalISel a non-optional library.
With this change, the GlobalISel library gets always built. In
particular, this is not possible to opt GlobalISel out of the build
using the LLVM_BUILD_GLOBAL_ISEL variable any more.

llvm-svn: 309990
2017-08-03 21:52:25 +00:00
Nirav Dave 3fc1c2365c [DAG] Allow merging of stores of vector loads
Remove restriction disallowing merging of stores vector loads into
larger store of larger vector load.

Reviewers: RKSimon, efriedma, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36158

llvm-svn: 309951
2017-08-03 15:51:20 +00:00
Robert Lougher 10f740df4d [LiveDebugVariables] Use lexical scope to trim debug value live intervals
The debug value live intervals computed by Live Debug Variables may extend
beyond the range of the debug location's lexical scope. In this case,
splitting of an interval can result in an interval outside of the scope being
created, causing extra unnecessary DBG_VALUEs to be emitted. To prevent this,
trim the intervals to the lexical scope.

This resolves PR33730.

Reviewers: aprantl

Differential Revision: https://reviews.llvm.org/D35953

llvm-svn: 309933
2017-08-03 11:54:02 +00:00
Simon Dardis 51296593a8 [SelectionDAG] Resolve PR33978.
rL306209 taught SelectionDAG how to add the dereferenceable flag when
expanding memcpy and memmove. The fix however contained a nit where
the offset + size was constructed as an APInt of PointerSize rather
than PointerSizeInBits.

This lead to isDereferenceableAndAlignedPointer() get truncated values or
values which would be sign extended within that function leading to
incorrect results.

Thanks to Alex Crichton for reporting the issue!

This resolves PR33978.

Reviewers: inouehrs

Differential Revision: https://reviews.llvm.org/D36236

llvm-svn: 309930
2017-08-03 09:38:46 +00:00
Sameer AbuAsal c8befe687f [RegisterCoalescer] Add wrapper for Erasing Instructions
Summary:
      To delete an instruction the coalescer needs to call eraseFromParent()
      on the MachineInstr, insert it in the ErasedInstrs list and update the
      Live Ranges structure. This patch re-factors the code to do all that in
      one function. This will also fix cases where previous code wasn't
      inserting deleted instructions in the ErasedList.

Reviewers: qcolombet, kparzysz

Reviewed By: qcolombet

Subscribers: MatzeB, llvm-commits, qcolombet

Differential Revision: https://reviews.llvm.org/D36204

llvm-svn: 309915
2017-08-03 02:41:17 +00:00
Rafael Espindola 79e238afee Delete Default and JITDefault code models
IMHO it is an antipattern to have a enum value that is Default.

At any given piece of code it is not clear if we have to handle
Default or if has already been mapped to a concrete value. In this
case in particular, only the target can do the mapping and it is nice
to make sure it is always done.

This deletes the two default enum values of CodeModel and uses an
explicit Optional<CodeModel> when it is possible that it is
unspecified.

llvm-svn: 309911
2017-08-03 02:16:21 +00:00
Hiroshi Inoue 0bd906ec8f [StackColoring] Update AliasAnalysis information in stack coloring pass (part 2)
This patch is update after the first patch (https://reviews.llvm.org/rL309651) based on the post-commit comments.

Stack coloring pass need to maintain AliasAnalysis information when merging stack slots of different types.
Actually, there is a FIXME comment in StackColoring.cpp

// FIXME: In order to enable the use of TBAA when using AA in CodeGen,
// we'll also need to update the TBAA nodes in MMOs with values
// derived from the merged allocas.

But, TBAA has been already enabled in CodeGen without fixing this pass.
The incorrect TBAA metadata results in recent failures in bootstrap test on ppc64le (PR33928) by allowing unsafe instruction scheduling.
Although we observed the problem on ppc64le, this is a platform neutral issue.

This patch makes the stack coloring pass maintains AliasAnalysis information when merging multiple stack slots.

This patch fixes PR33928.

llvm-svn: 309849
2017-08-02 18:16:32 +00:00
Adrian Prantl bd6d291c59 Assert that the offset of a DBG_VALUE is always 0. (NFC)
llvm-svn: 309834
2017-08-02 17:19:13 +00:00
Adrian Prantl 5577363a2c Remove the unused Offset field from MachineLocation (NFC)
rdar://problem/33580047

llvm-svn: 309831
2017-08-02 17:07:38 +00:00
Nirav Dave e44032f7e7 [DAG] Improve candidate pruning in store merge failure case. NFCI
During store merge we construct a sorted list of consecutive store
candidates and consider subsequences for merging into a single
store. For each subsequence we check if the stored value type is legal
the merged store would have valid and fast and if the constructed
value to be stored is valid. The only properties that affect this
check between subsequences is the size of the subsequence, the
alignment of the first store, the alignment of the stored load value
(when merging stores-of-loads), and whether the merged value is a
constant zero.

If we do not find a viable mergeable subsequence starting from the
first store of length N, we know that a subsequence starting at a
later store of length N will also fail unless the new store's
alignment, the new load's alignment (if we're merging store-of-loads),
or we've dropped stores of nonzero value and could construct a merged
stores of zero (for merging constants).

As a result if we fail to find a valid subsequence starting from the
first store we can safely skip considering subsequences that start
with subsequent stores unless one of the above properties is
true. This significantly (2x) improves compile time in some
pathological cases.

Reviewers: RKSimon, efriedma, zvi, spatel, waltl

Subscribers: grandinj, llvm-commits

Differential Revision: https://reviews.llvm.org/D35901

llvm-svn: 309830
2017-08-02 16:35:58 +00:00
Adrian Prantl 61b39b2aec Remove unused includes of MachineLocation.h (NFC)
llvm-svn: 309824
2017-08-02 15:32:18 +00:00
Adrian Prantl 2049c0d734 Remove unreachable code. (NFC)
MachineLocation::getOffset() always returns 0.

rdar://problem/33580047

llvm-svn: 309823
2017-08-02 15:22:17 +00:00
Diana Picus d5a00b0ff6 [MIR] Print target-specific constant pools
This should enable us to test the generation of target-specific constant
pools, e.g. for ARM:

constants:
 - id:              0
   value:           'g(GOT_PREL)-(LPC0+8-.)'
   alignment:       4
   isTargetSpecific: true

I intend to use this to test PIC support in GlobalISel for ARM.

This is difficult to test outside of that context, since the existing
MIR tests usually rely on parser support as well, and that seems a bit
trickier to add. We could try to add a unit test, but the setup for that
seems rather convoluted and overkill.

We do test however that the parser reports a nice error when
encountering a target-specific constant pool.

Differential Revision: https://reviews.llvm.org/D36092

llvm-svn: 309806
2017-08-02 11:09:30 +00:00
Nirav Dave 15894f8dec [DAG] Refactor store merge subexpressions. NFC.
Distribute various expressions across ifs.

llvm-svn: 309777
2017-08-02 01:08:38 +00:00
Matt Arsenault acc5e82b0e DAG: Undo and->or combine with FrameIndexes
This pattern shows up when lowering byval copies on AMDGPU.

The byval object access is split into 4-byte chunks, adding a
constant offset to the FixedStack base. When some of the offsets
turn into ors, this prevents combining the constant offsets.

This makes it not apparent that the object is there when matching
addressing modes, so it ends up using a scratch wave offset
relative access and the lengthy frame index expansion for that.

llvm-svn: 309775
2017-08-02 00:43:42 +00:00
Adrian Prantl 83ca4fc7bc Update LiveDebugValues to generate DIExpressions for spill offsets
instead of using the deprecated offset field of DBG_VALUE.

This has no observable effect on the generated DWARF, but the
assembler comments will look different.

rdar://problem/33580047

llvm-svn: 309773
2017-08-02 00:16:56 +00:00
Adrian Prantl aac78ce47e Use helper function instead of manually constructing DBG_VALUEs (NFC)
rdar://problem/33580047

llvm-svn: 309757
2017-08-01 22:37:35 +00:00
Adrian Prantl 032d2381bf Remove PrologEpilogInserter's usage of DBG_VALUE's offset field
In the last half-dozen commits to LLVM I removed code that became dead
after removing the offset parameter from llvm.dbg.value gradually
proceeding from IR towards the backend. Before I can move on to
DwarfDebug and friends there is one last side-called offset I need to
remove:  This patch modifies PrologEpilogInserter's use of the
DBG_VALUE's offset argument to use a DIExpression instead. Because the
PrologEpilogInserter runs at the Machine level I had to play a little
trick with a named llvm.dbg.mir node to get the DIExpressions to print
in MIR dumps (which print the llvm::Module followed by the
MachineFunction dump).

I also had to add rudimentary DwarfExpression support to CodeView and
as a side-effect also fixed a bug (CodeViewDebug::collectVariableInfo
was supposed to give up on variables with complex DIExpressions, but
would fail to do so for fragments, which are also modeled as
DIExpressions).

With this last holdover removed we will have only one canonical way of
representing offsets to debug locations which will simplify the code
in DwarfDebug (and future versions of CodeViewDebug once it starts
handling more complex expressions) and make it easier to reason about.

This patch is NFC-ish: All test case changes are for assembler
comments and the binary output does not change.

rdar://problem/33580047
Differential Revision: https://reviews.llvm.org/D36125

llvm-svn: 309751
2017-08-01 21:45:24 +00:00
Nirav Dave dcc5afaad9 [DAG] Factor out common expressions. NFC.
llvm-svn: 309740
2017-08-01 20:30:52 +00:00
Reid Kleckner 29c675247d [DebugInfo] Don't turn dbg.declare into DBG_VALUE for static allocas
Summary:
We already have information about static alloca stack locations in our
side table. Emitting instructions for them is inefficient, and it only
happens when the address of the alloca has been materialized within the
current block, which isn't often.

Reviewers: aprantl, probinson, dblaikie

Subscribers: jfb, dschuff, sbc100, jgravelle-google, hiraditya, llvm-commits, aheejin

Differential Revision: https://reviews.llvm.org/D36117

llvm-svn: 309729
2017-08-01 19:45:09 +00:00
Nirav Dave 35dd1ac29c Pull out VectorNumElements value. NFC.
llvm-svn: 309719
2017-08-01 18:19:56 +00:00
Nirav Dave 27a6605bdc Revert "[DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector."
This reverts commit r309680 which appears to be raising an assertion
in the test-suite.

llvm-svn: 309717
2017-08-01 18:09:25 +00:00
Sanjay Patel 4dbdd470a8 [CGP] use narrower types in memcmp expansion when possible
This only affects very small memcmp on x86 for now, but it
will become more important if we allow vector-sized load and
compares.

llvm-svn: 309711
2017-08-01 17:24:54 +00:00
Nirav Dave f54c8370e5 [DAG] Convert extload check to equivalent type check. NFC.
Replace check with check that consuming store has the same type.

llvm-svn: 309708
2017-08-01 17:19:41 +00:00
Nirav Dave b5a0af6b6e [DAG] Move extload check in store merge. NFC.
Move candidate check from later check to initial candidate check.

llvm-svn: 309698
2017-08-01 16:00:47 +00:00
Manoj Gupta 7ebbe655ed [X86] Fix a crash in FEntryInserter Pass.
Summary:
FEntryInserter pass unconditionally derefs the first Instruction
in the first Basic Block. The pass crashes when the first
BasicBlock is empty. Fix the crash by not dereferencing the basic
Block iterator. This fixes an issue observed when building Linux kernel
4.4 with clang.

Fixes PR33971.

Reviewers: hfinkel, niravd, dblaikie

Reviewed By: niravd

Subscribers: davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D35979

llvm-svn: 309694
2017-08-01 15:39:12 +00:00
David Blaikie c2be863681 DebugInfo: Update flag description that'd been copypasted from another
Post-commit review feedback from Paul Robinson on r309630. Thanks Paul!

llvm-svn: 309685
2017-08-01 14:50:50 +00:00
Nirav Dave b5cb48c6ae [DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector.
Summary:
Allow SCALAR_TO_VECTOR of EXTRACT_VECTOR_ELT to reduce to
EXTRACT_SUBVECTOR of vector shuffle when output is smaller. Marginally
improves vector shuffle computations.

Reviewers: efriedma, RKSimon, spatel

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35566

llvm-svn: 309680
2017-08-01 13:45:35 +00:00
Andrew V. Tischenko d56595184b Support itineraries in TargetSubtargetInfo::getSchedInfoStr - Now if the given instr does not have sched model then we try to calculate the latecy/throughput with help of itineraries.
Differential Revision https://reviews.llvm.org/D35997

llvm-svn: 309666
2017-08-01 09:15:43 +00:00
Hiroshi Inoue b9417dbd48 [StackColoring] Update AliasAnalysis information in stack coloring pass
Stack coloring pass need to maintain AliasAnalysis information when merging stack slots of different types.
Actually, there is a FIXME comment in StackColoring.cpp

// FIXME: In order to enable the use of TBAA when using AA in CodeGen,
// we'll also need to update the TBAA nodes in MMOs with values
// derived from the merged allocas.

But, TBAA has been already enabled in CodeGen without fixing this pass.
The incorrect TBAA metadata results in recent failures in bootstrap test on ppc64le (PR33928) by allowing unsafe instruction scheduling.
Although we observed the problem on ppc64le, this is a platform neutral issue.

This patch makes the stack coloring pass maintains AliasAnalysis information when merging multiple stack slots.

llvm-svn: 309651
2017-08-01 03:32:15 +00:00
Eli Friedman 37f41d1e7c [ScheduleDAG] Don't schedule node with physical register interference
https://reviews.llvm.org/D31536 didn't really solve the problem it was
trying to solve; it got rid of the assertion failure, but we were still
scheduling the DAG incorrectly (mixing together instructions from
different calls), leading to a MachineVerifier failure.

In order to schedule the DAG correctly, we have to make sure we don't
schedule a node which should be blocked by an interference. Fix
ScheduleDAGRRList::PickNodeToScheduleBottomUp so it doesn't pick a node
like that.

The added call to FindAvailableNode() is the key change here; this makes
sure we don't try to schedule a call while we're in the middle of
scheduling a different call. I'm not sure this is the right approach; in
particular, I'm not sure how to prove we don't end up with an infinite
loop of repeatedly backtracking.

This also reverts the code change from D31536. It doesn't do anything
useful: we should never schedule an ADJCALLSTACKDOWN unless we've
already scheduled the corresponding ADJCALLSTACKUP.

Differential Revision: https://reviews.llvm.org/D33818

llvm-svn: 309642
2017-08-01 00:28:40 +00:00
David Blaikie 038e28a5a7 DebugInfo: Put range base specifier entry functionality behind a flag
Chromium's gold build seems to have trouble with this (gold produces
errors) - not sure if it's gold that's not coping with the valid
representation, or a bug in the implementation in LLVM, etc.

llvm-svn: 309630
2017-07-31 21:48:42 +00:00
Reid Kleckner 2de471dca9 [codeview] Ignore DBG_VALUEs when choosing a BB start source loc
When the first instruction of a basic block has no location (consider a
LEA materializing the address of an alloca for a call), we want to start
the line table for the block with the first valid source location in the
block.  We need to ignore DBG_VALUE instructions during this scan to get
decent line tables.

llvm-svn: 309628
2017-07-31 21:03:08 +00:00
Quentin Colombet 15f6ffbf7c [TargetPassConfig] Feature generic options to setup start/stop-after/before
This patch refactors the code used in llc such that all the users of the
addPassesToEmitFile API have access to a homogeneous way of handling
start/stop-after/before options right out of the box.

In particular, just invoking addPassesToEmitFile will set the proper
pipeline without additional effort (modulo parsing a .mir file if the
start-before/after options are used.

NFC.

Differential Revision: https://reviews.llvm.org/D30913

llvm-svn: 309599
2017-07-31 18:24:07 +00:00
Sanjay Patel fea731a4aa [CGP] use subtract or subtract-of-cmps for result of memcmp expansion
As noted in the code comment, transforming this in the other direction might require 
a separate transform here in CGP given the block-at-a-time DAG constraint.

Besides that theoretical motivation, there are 2 practical motivations for the 
subtract-of-cmps form:

1. The codegen for both x86 and PPC is better for this IR (though PPC could be better still). 
   There is discussion about canonicalizing IR to the select form 
   ( http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html ), 
   so we probably need to add DAG transforms for those patterns anyway, but this improves the 
   memcmp output without waiting for that step.

2. If we allow vector-sized chunks for the load and compare, x86 is better prepared to convert
   that to optimal code when using subtract-of-cmps, so another prerequisite patch is avoided
   if we choose to enable that.

Differential Revision: https://reviews.llvm.org/D34904

llvm-svn: 309597
2017-07-31 18:08:24 +00:00
Aditya Nandakumar 02c602e18c [GISel]: Support Widening G_ICMP's destination operand.
Updated AArch64 to widen destination to s32.
https://reviews.llvm.org/D35737

Reviewed by Tim

llvm-svn: 309579
2017-07-31 17:00:16 +00:00
Florian Hahn 1b09a37c07 Extend ifndef to printDebugLoc.
GCC7 did not warn about that, but Clang does.

llvm-svn: 309573
2017-07-31 16:29:00 +00:00
Florian Hahn 94b8a87c8e Extend ifdefs to more unused helper functions.
This fixes a buildbot failure with -Werror introduced by r309553

llvm-svn: 309572
2017-07-31 16:11:43 +00:00
Simon Dardis 8930b4a049 [SelectionDAG][mips] Fix PR33883
PR33883 shows that calls to intrinsic functions should not have their vector
arguments or returns subject to ABI changes required by the target.

This resolves PR33883.

Thanks to Alex Crichton for reporting the issue!

Reviewers: zoran.jovanovic, atanasyan

Differential Revision: https://reviews.llvm.org/D35765

llvm-svn: 309561
2017-07-31 14:06:58 +00:00
Florian Hahn 6b3216aad8 Guard print() functions only used by dump() functions.
Summary:
Since  r293359, most dump() function are only defined when
`!defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)` holds. print() functions
only used by dump() functions are now unused in release builds,
generating lots of warnings. This patch only defines some print()
functions if they are used.

Reviewers: MatzeB

Reviewed By: MatzeB

Subscribers: arsenm, mzolotukhin, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D35949

llvm-svn: 309553
2017-07-31 10:07:49 +00:00
David Blaikie 4dd663752d DebugInfo: Fix r309526, ensure resetting base address selection entries are used
Missed the resetting base address selections when going from a base
address version to zero base address for non-base-addressed entries.

llvm-svn: 309529
2017-07-31 00:18:24 +00:00
David Blaikie 89c81a0b91 DebugInfo: Use base address selection entries in debug_ranges to reduce relocations
(from comments in the test)
Group ranges in a range list that apply to the same section and use a base
address selection entry to reduce the number of relocations to one reloc per
section per range list. DWARF5 debug_rnglist will be more efficient than this
in terms of relocations, but it's still better than one reloc per entry in a
range list.

This is an object/executable size tradeoff - shrinking objects, but growing
the linked executable. In one large binary tested, total object size (not just
debug info) shrank by 16%, entirely relocation entries. Linked executable
grew by 4%. This was with compressed debug info in the objects, uncompressed
in the linked executable. Without compression in the objects, the win would be
smaller (the growth of debug_ranges itself would be more significant).

llvm-svn: 309526
2017-07-30 22:10:00 +00:00
Simon Pilgrim 718cb0ea62 [SelectionDAG][X86] CombineBT - more aggressively determine demanded bits
This patch is in 2 parts:

1 - replace combineBT's use of SimplifyDemandedBits (hasOneUse only) with SelectionDAG::GetDemandedBits to more aggressively determine the lower bits used by BT.

2 - update SelectionDAG::GetDemandedBits to support ANY_EXTEND - if the demanded bits are only in the non-extended portion, then peek through and demand from the source value and then ANY_EXTEND that if we found a match.

Differential Revision: https://reviews.llvm.org/D35896

llvm-svn: 309486
2017-07-29 14:50:25 +00:00
Jessica Paquette d87f54493d [MachineOutliner] NFC: Change IsTailCall to a call class + frame class
This commit

- Removes IsTailCall and replaces it with a target-defined unsigned
- Refactors getOutliningCallOverhead and getOutliningFrameOverhead so that they don't use IsTailCall
- Adds a call class + frame class classification to OutlinedFunction and Candidate respectively

This accomplishes a couple things.

Firstly, we don't need the notion of *tail call* in the general outlining algorithm.

Secondly, we now can have different "outlining classes" for each candidate within a set of candidates.
This will make it easy to add new ways to outline sequences for certain targets and dynamically choose
an appropriate cost model for a sequence depending on the context that that sequence lives in.

Ultimately, this should get us closer to being able to do something like, say avoid saving the link
register when outlining AArch64 instructions.

llvm-svn: 309475
2017-07-29 02:55:46 +00:00
Adrian Prantl 359846fe1c Remove the unused offset field from LiveDebugValues (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309455
2017-07-28 23:25:51 +00:00
Adrian Prantl b2811e50f9 Remove the unused offset field from LiveDebugVariables (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309451
2017-07-28 23:06:50 +00:00
Adrian Prantl 8b9bb534a1 Remove the unused offset from DBG_VALUE (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309450
2017-07-28 23:00:45 +00:00
Adrian Prantl d92ac5a259 Remove the unused DBG_VALUE offset parameter from GlobalISel (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309449
2017-07-28 22:46:20 +00:00
Adrian Prantl 1b63dc54b0 Remove the unused DBG_VALUE offset parameter from RegAllocFast (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309446
2017-07-28 22:36:55 +00:00
Adrian Prantl a617576bb1 Remove the unused dbg.value offset from SelectionDAG (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309436
2017-07-28 21:27:35 +00:00
Adrian Prantl abe04759a6 Remove the obsolete offset parameter from @llvm.dbg.value
There is no situation where this rarely-used argument cannot be
substituted with a DIExpression and removing it allows us to simplify
the DWARF backend. Note that this patch does not yet remove any of
the newly dead code.

rdar://problem/33580047
Differential Revision: https://reviews.llvm.org/D35951

llvm-svn: 309426
2017-07-28 20:21:02 +00:00
Reid Kleckner 9be82c3169 Fix conditional tail call branch folding when both edges are the same
The conditional tail call logic did the wrong thing when both
destinations of a conditional branch were the same:

BB#1: derived from LLVM BB %entry
    Live Ins: %EFLAGS
    Predecessors according to CFG: BB#0
        JE_1 <BB#5>, %EFLAGS<imp-use,kill>
        JMP_1 <BB#5>

BB#5: derived from LLVM BB %sw.epilog
    Predecessors according to CFG: BB#1
        TCRETURNdi64 <ga:@mergeable_conditional_tailcall>, 0, ...

We would fold the JE_1 to a TCRETURNdi64cc, and then remove our BB#5
successor. Then BB#5 would be deleted as it had no predecessors, leaving
a dangling "JMP_1 <BB#5>" reference behind to cause assertions later.

This patch checks that both conditional branch destinations are
different before doing the transform. The standard branch folding logic
is able to remove both the JMP_1 and the JE_1, and for my test case we
end up forming a better conditional tail call later.

Fixes PR33980

llvm-svn: 309422
2017-07-28 19:48:40 +00:00
Jessica Paquette 4602c3437c [MachineOutliner] NFC: Comment tidying
The comment on describing the suffix tree had some pruning
stuff that was out of date in it.

Also fixed some typos.

llvm-svn: 309365
2017-07-28 05:59:30 +00:00
Jessica Paquette 809d708b8a [MachineOutliner] NFC: Split up getOutliningBenefit
This is some more cleanup in preparation for some actual
functional changes. This splits getOutliningBenefit into
two cost functions: getOutliningCallOverhead and
getOutliningFrameOverhead. These functions return the
number of instructions that would be required to call
a specific function and the number of instructions
that would be required to construct a frame for a
specific funtion. The actual outlining benefit logic
is moved into the outliner, which calls these functions.

The goal of refactoring getOutliningBenefit is to:

- Get us closer to getting rid of the IsTailCall flag

- Further split up "target-specific" things and
"general algorithm" things

llvm-svn: 309356
2017-07-28 03:21:58 +00:00
David Blaikie 89daf77a11 DebugInfo: Consider a CU containing only local imported entities to be 'empty'
This can come up in ThinLTO & wastes space & makes degenerate IR.

As per the added FIXME, ultimately, local imported entities should hang
off the function and that way the imported entity list on the CU can be
tested for emptiness like all the other CU lists.

(function-attached local imported entities are probably also the best
path forward for fixing how imported entities are handled both in
cross-module use (currently, while ThinLTO preserves the imported
entities, they would not get used at the imported inlined location -
only in the abstract origin that appears in the partial CU created by
the import (which isn't emitted under Fission due to cross-CU
limitations there)) and to reduce the number of points where imported
entities are emitted (they're currently emitted into every inlined
instance, concrete instance, and abstract origin - they should only go
in teh abstract origin if there is one, otherwise in the concrete
instance - but this requires lots of delayed handling and wiring up,
same as abstract variables & subprograms))

llvm-svn: 309354
2017-07-28 03:06:25 +00:00
Jessica Paquette 78681be2a4 [MachineOutliner] Cleanup: move findCandidates out of suffix tree
Doing some cleanup in preparation for some functional changes.
This commit moves findCandidates out of the suffix tree and into the
MachineOutliner class. This is much easier to follow, and removes
the burden of candidate choice from the suffix tree.

It also adds a couple FIXMEs and simplifies building outlined function
names.

llvm-svn: 309334
2017-07-27 23:24:43 +00:00
Simon Pilgrim ac84850ea6 [SelectionDAG] Improve DAGTypeLegalizer::convertMask assertion (PR33960)
Improve DAGTypeLegalizer::convertMask's isSETCCorConvertedSETCC assertion to properly check for any mixture of SETCC or BUILD_VECTOR of constants, or a logical mask op of them.

llvm-svn: 309302
2017-07-27 18:15:54 +00:00
Adam Nemet 6374331a8c [OptRemark] Allow streaming of 64-bit integers
llvm-svn: 309293
2017-07-27 16:54:13 +00:00
Simon Pilgrim 64a795c5f5 [SelectionDAG] Avoid repeated calls to getNumOperands in for loop. NFCI.
llvm-svn: 309283
2017-07-27 15:42:21 +00:00
Adrian Prantl 960e7663f3 remove redundant check
llvm-svn: 309280
2017-07-27 15:24:20 +00:00
Simon Pilgrim 0d543b5921 [SelectionDAG] Tidyup mask creation. NFCI.
Assign all concat elements to UNDEF and then just replace the first element, instead of copying everything individually.

llvm-svn: 309277
2017-07-27 15:08:53 +00:00
David Blaikie 2195e13676 DebugInfo: Ensure imported entities at the top level of an inlined function don't cause degenerate concrete definitions
Local imported entities at the top level of a subprogram were being
handled differently from those in nested scopes - that different
handling would cause pseudo concrete out-of-line definitions to be
created (but without any of their attributes, nor an abstract_origin) in
the case where there was no real concrete definition.

These local imported entities also only appeared in the concrete
definition where those imported entities in nested scopes appear in all
cases (abstract, concrete, and inlined). This change at least makes top
level case handle the same as the others - though there's a FIXME to
improve this to /only/ emit them into the abstract origin (though this
requires more plumbing - like the abstract subprogram and variable
handling that must defer population until the end of the unit to
discover if there is an abstract origin, or only a standalone concrete
definition).

llvm-svn: 309237
2017-07-27 00:06:53 +00:00
Peter Collingbourne 081ffe2ff2 Change CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI.
This was a use-after-free waiting to happen.

llvm-svn: 309159
2017-07-26 19:15:29 +00:00
Andrew V. Tischenko d1fefa3d7c This patch returns proper value to indicate the case when instruction throughput can't be calculated.
Differential revision https://reviews.llvm.org/D35831

llvm-svn: 309156
2017-07-26 18:55:14 +00:00
Adrian Prantl 833ad37c90 Do a better job at emitting prefrabricated skeleton CUs.
This is a better fix than r308708 for the problem introduced in
r304020. It restores the skeleton CU testcases modified by that commit
to their original form and most importantly ensures that
frontend-generated skeleton CUs (such as used to point to Clang
modules) come after the regular CUs. This broke for DICompileUnit
nodes that don't have any immediate children because they are now
constructed lazily instead of the order in which they are listed in
!llvm.dbg.cu. After this commit we still don't guarantee that order,
but we do guarantee that empty skeletons come last.

Shipping versions of LLDB are very sensitive to the ordering of
CUs. I'll track a fix for LLDB to be more permissive separately.
This fixes a test failure in the LLDB testsuite.

rdar://problem/33357252

llvm-svn: 309154
2017-07-26 18:48:32 +00:00
Zvi Rackover 092f199188 DAGCombiner: Extend reduceBuildVecToTrunc to handle non-zero offset
Summary:
Adding support for combining power2-strided build_vector's where the
first build_vectori's operand is extracted from a non-zero index.

Example:

 v4i32 build_vector((extract_elt V, 1),
                    (extract_elt V, 3),
                    (extract_elt V, 5),
                    (extract_elt V, 7))
 -->
 v4i32 truncate (bitcast (shuffle<1,u,3,u,5,u,7,u> V, u) to v4i64)

Reviewers: delena, RKSimon, guyblank

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35700

llvm-svn: 309108
2017-07-26 12:57:03 +00:00
Adrian Prantl be66271f04 Debug Info: Support fragmented variables in the MMI side table
This reapplies commit r309034 with a bugfix+test for inlined variables.

llvm-svn: 309057
2017-07-25 23:32:59 +00:00
Adrian Prantl b6d5faf2ea Revert "Debug Info: Support fragmented variables in the MMI side table"
This reverts commit r309034 because of a sanitizer issue.

llvm-svn: 309035
2017-07-25 21:50:45 +00:00
Adrian Prantl 3d1ab0cd1e Debug Info: Support fragmented variables in the MMI side table
<rdar://problem/17816343>

llvm-svn: 309034
2017-07-25 21:29:22 +00:00
Simon Pilgrim 6d59933175 [DAG] Move DAGCombiner::GetDemandedBits to SelectionDAG
This patch moves the DAGCombiner::GetDemandedBits function to SelectionDAG::GetDemandedBits as a first step towards making it easier for targets to get to the source of any demanded bits without the limitations of SimplifyDemandedBits.

Differential Revision: https://reviews.llvm.org/D35841

llvm-svn: 308983
2017-07-25 16:36:44 +00:00
Francois Pichet 82bf3de606 Fix endianness bug in DAGCombiner::visitTRUNCATE and visitEXTRACT_VECTOR_ELT
Summary:
Do not assume little endian architecture in DAGCombiner::visitTRUNCATE and DAGCombiner::visitEXTRACT_VECTOR_ELT.
PR33682

Reviewers: hfinkel, sdardis, RKSimon

Reviewed By: sdardis, RKSimon

Subscribers: uabelho, RKSimon, sdardis, llvm-commits

Differential Revision: https://reviews.llvm.org/D34990

llvm-svn: 308960
2017-07-25 09:40:35 +00:00
Matt Arsenault 5fbc87021e RA: Replace asserts related to empty live intervals
These don't exactly assert the same thing anymore, and
allow empty live intervals with non-empty uses.

Removed in r308808 and r308813.

llvm-svn: 308906
2017-07-24 18:07:55 +00:00
Benjamin Kramer fc638c11bb [CodeGenPrepare] Cut off FindAllMemoryUses if there are too many uses.
This avoids excessive compile time. The case I'm looking at is
Function.cpp from an old version of LLVM that still had the giant memcmp
string matcher in it. Before r308322 this compiled in about 2 minutes,
after it, clang takes infinite* time to compile it. With this patch
we're at 5 min, which is still bad but this is a pathological case.

The cut off at 20 uses was chosen by looking at other cut-offs in LLVM
for user scanning. It's probably too high, but does the job and is very
unlikely to regress anything.

Fixes PR33900.

* I'm impatient and aborted after 15 minutes, on the bug report it was
  killed after 2h.

llvm-svn: 308891
2017-07-24 16:18:09 +00:00
Reid Kleckner 898ddf61c0 [codeview] Emit 'D' as the cv source language for D code
This matches DMD:
522263965c/src/ddmd/backend/cv8.c (L199)

Fixes PR33899.

llvm-svn: 308890
2017-07-24 16:16:42 +00:00
Reid Kleckner 7f6b2534fb Format some case labels and shrink an anonymous namespace NFC
llvm-svn: 308889
2017-07-24 16:16:17 +00:00
Petr Hosek 710479cede [CodeGen][X86] Fuchsia supports sincos* libcalls and sin+cos->sincos optimization
Patch by Roland McGrath

Differential Revision: https://reviews.llvm.org/D35748

llvm-svn: 308854
2017-07-23 22:30:00 +00:00
Nirav Dave 4e6dcf73f9 [DAG] Fix typo preventing some stores merges to truncated stores.
Check the actual memory type stored and not the extended value size
when considering if truncated store merge is worthwhile.

Reviewers: efriedma, RKSimon, spatel, jyknight

Reviewed By: efriedma

Subscribers: llvm-commits, nhaehnle

Differential Revision: https://reviews.llvm.org/D35623

llvm-svn: 308833
2017-07-23 02:06:28 +00:00
Matt Arsenault c5d1e503e1 RA: Remove another assert on empty intervals
This case is similar to the one fixed in r308808,
except when rematerializing.

Fixes bug 33884.

llvm-svn: 308813
2017-07-22 00:24:01 +00:00
Matt Arsenault 6a963f76ca RA: Remove assert on empty live intervals
This is possible if there is an undef use when
splitting the vreg during spilling.

Fixes bug 33620.

llvm-svn: 308808
2017-07-21 23:56:13 +00:00
Xin Tong 495a3022da [DAGCombiner] Update comment. NFC
llvm-svn: 308772
2017-07-21 19:10:19 +00:00
Jonas Paulsson 024e319489 [SystemZ, LoopStrengthReduce]
This patch makes LSR generate better code for SystemZ in the cases of memory
intrinsics, Load->Store pairs or comparison of immediate with memory.

In order to achieve this, the following common code changes were made:

 * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if
 LSR should do instruction-based addressing evaluations by calling
 isLegalAddressingMode() with the Instruction pointers.
 * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy
 as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address,
 not just loads or stores.

SystemZ changes:

 * isLSRCostLess() implemented with Insns first, and without ImmCost.
 * New function supportedAddressingMode() that is a helper for TTI methods
 looking at Instructions passed via pointers.

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D35262
https://reviews.llvm.org/D35049

llvm-svn: 308729
2017-07-21 11:59:37 +00:00
Philipp Schaad a81d23030f Commit access test
llvm-svn: 308712
2017-07-21 03:51:01 +00:00
Adrian Prantl 65e7ca995d Debug Info: Don't strip clang module skeleton CUs.
This corrects a (hopefully :-) accidental side-effect of r304020.

rdar://problem/33442618

llvm-svn: 308708
2017-07-21 01:24:05 +00:00
Tim Northover 071d77a51f GlobalISel: stop localizer putting constants before EH_LABELs
If the localizer pass puts one of its constants before the label that tells the
unwinder "jump here to handle your exception" then control-flow will skip it,
leaving uninitialized registers at runtime. That's bad.

llvm-svn: 308687
2017-07-20 22:58:26 +00:00
Matt Arsenault db78273b6e Add an ID field to StackObjects
On AMDGPU SGPR spills are really spilled to another register.
The spiller creates the spills to new frame index objects,
which is used as a placeholder.

This will eventually be replaced with a reference to a position
in a VGPR to write to and the frame index deleted. It is
most likely not a real stack location that can be shared
with another stack object.

This is a problem when StackSlotColoring decides it should
combine a frame index used for a normal VGPR spill with
a real stack location and a frame index used for an SGPR.

Add an ID field so that StackSlotColoring has a way
of knowing the different frame index types are
incompatible.

llvm-svn: 308673
2017-07-20 21:03:45 +00:00
Francis Visoiu Mistrih 39aa5dbbf5 [PEI] Fix refactoring from r308664
llvm-svn: 308666
2017-07-20 20:31:44 +00:00
Mandeep Singh Grang d41ac895bb [COFF, ARM64, CodeView] Add support to emit CodeView debug info for ARM64 COFF
Reviewers: compnerd, ruiu, rnk, zturner

Reviewed By: rnk

Subscribers: majnemer, aemerson, aprantl, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35518

llvm-svn: 308665
2017-07-20 20:20:00 +00:00
Francis Visoiu Mistrih 631f6b888c [PEI] Separate saving and restoring CSRs into different functions. NFC
Split insertCSRSpillsAndRestores into insertCSRSaves + insertCSRRestores.

This is mostly useful for future shrink-wrapping improvements where we
want to save / restore a specific part of the CSRs in a specific block.

Differential Revision: https://reviews.llvm.org/D35644

llvm-svn: 308664
2017-07-20 20:17:17 +00:00
Krzysztof Parzyszek f3a778d757 Implement LaneBitmask::getNumLanes and LaneBitmask::getHighestLane
This should eliminate most uses of countPopulation and Log2_32 on
the lane mask values.

llvm-svn: 308658
2017-07-20 19:43:19 +00:00
Krzysztof Parzyszek e9f0c1e031 Use LaneBitmask::getLane in a few more places
llvm-svn: 308655
2017-07-20 19:15:56 +00:00
Nirav Dave 4aa51c3af1 [DAG] Commit missed nit cleanup from r308617. NFC.
llvm-svn: 308645
2017-07-20 18:07:57 +00:00
Nirav Dave df86d2d008 [DAG] Handle missing transform in fold of value extension case.
Summary:
When pushing an extension of a constant bitwise operator on a load
into the load, change other uses of the load value if they exist to
prevent the old load from persisting.

Reviewers: spatel, RKSimon, efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35030

llvm-svn: 308618
2017-07-20 13:57:32 +00:00
Nirav Dave 77cc6f23b9 [DAG] Optimize away degenerate INSERT_VECTOR_ELT nodes.
Summary:
Add missing vector write of vector read reduction, i.e.:

(insert_vector_elt x (extract_vector_elt x idx) idx) to x

Reviewers: spatel, RKSimon, efriedma

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35563

llvm-svn: 308617
2017-07-20 13:48:17 +00:00
Simon Pilgrim 2911296f10 [DAGCombiner] Match ISD::SRL non-uniform constant vectors patterns using predicates.
Use predicate matchers introduced in D35492 to match more ISD::SRL constant folds

llvm-svn: 308602
2017-07-20 11:03:30 +00:00
Simon Pilgrim b9ff25df59 Remove trailing whitespace. NFCI.
llvm-svn: 308601
2017-07-20 10:43:52 +00:00
Simon Pilgrim 7ff0e49d8c [DAGCombiner] Match ISD::SRA non-uniform constant vectors patterns using predicates.
Use predicate matchers introduced in D35492 to match more ISD::SRA constant folds

llvm-svn: 308600
2017-07-20 10:43:05 +00:00
Simon Pilgrim 9d7863b935 [DAGCombiner] Match non-uniform constant vectors using predicates.
Most combines currently recognise scalar and splat-vector constants, but not non-uniform vector constants.

This patch introduces a matching mechanism that uses predicates to check against BUILD_VECTOR of ConstantSDNode, as well as scalar ConstantSDNode cases.

I've changed a couple of predicates to demonstrate - the combine-shl changes add currently unsupported cases, while the MatchRotate replaces an existing mechanism.

Differential Revision: https://reviews.llvm.org/D35492

llvm-svn: 308598
2017-07-20 10:13:40 +00:00
Francis Visoiu Mistrih 185b2e3d32 Revert "[PEI] Simplify handling of targets with no phys regs. NFC"
This reverts commit ce30ab6e5598f3c24f59ad016dc9526bc9a1d450.

sanitizer-ppc64le-linux seems to segfault when testing the sanitizers.

llvm-svn: 308581
2017-07-20 02:47:05 +00:00
Francis Visoiu Mistrih b3ddc1686b Revert "[PEI] Separate saving and restoring CSRs into different functions. NFC"
This reverts commit 540f6a26ae932469804a379ce9a8cbe715d59c23.

sanitizer-ppc64le-linux seems to segfault when testing the sanitizers.

llvm-svn: 308580
2017-07-20 02:47:04 +00:00
Francis Visoiu Mistrih 303e5df4e2 [PEI] Separate saving and restoring CSRs into different functions. NFC
Split insertCSRSpillsAndRestores into insertCSRSaves + insertCSRRestores.

This is mostly useful for future shrink-wrapping improvements where we
want to save / restore a specific part of the CSRs in a specific block.

Differential Revision: https://reviews.llvm.org/D35644

llvm-svn: 308573
2017-07-20 00:58:37 +00:00
Matt Arsenault d62fe83005 Replace -print-whole-regmask with a threshold.
The previous flag/default of printing everything is
not helpful when there are thousands of registers
in the mask.

llvm-svn: 308572
2017-07-20 00:37:31 +00:00
Francis Visoiu Mistrih ede08ef314 Revert "[PEI] Separate saving and restoring CSRs into different functions. NFC"
This reverts commit a84d1fa6847e70ebf63594d41a00b473c941bd72.

llvm-svn: 308562
2017-07-20 00:08:02 +00:00
Francis Visoiu Mistrih 9b97a31870 [AsmPrinter] Constify needsCFIMoves. NFC
llvm-svn: 308557
2017-07-19 23:47:33 +00:00
Francis Visoiu Mistrih 52042aa21e [PEI] Add basic opt-remarks support
Add optimization remarks support to the PrologueEpilogueInserter. For
now, emit the stack size as an analysis remark, but more additions wrt
shrink-wrapping may be added.

https://reviews.llvm.org/D35645

llvm-svn: 308556
2017-07-19 23:47:32 +00:00
Francis Visoiu Mistrih a1f21bca46 [PEI] Simplify handling of targets with no phys regs. NFC
Make doSpillCalleeSavedRegs a member function, instead of passing most
of the members of PEI as arguments.

Differential Revision: https://reviews.llvm.org/D35642

llvm-svn: 308555
2017-07-19 23:47:32 +00:00
Francis Visoiu Mistrih 3b7bbdbdd5 [PEI] Separate saving and restoring CSRs into different functions. NFC
Split insertCSRSpillsAndRestores into insertCSRSaves + insertCSRRestores.

This is mostly useful for future shrink-wrapping improvements where we
want to save / restore a specific part of the CSRs in a specific block.

Differential Revision: https://reviews.llvm.org/D35644

llvm-svn: 308554
2017-07-19 23:47:31 +00:00
Derek Schuff 36454afab5 Move Runtime libcall definitions to a .def file
This will allow eliminating the duplication of the names, and allow adding
extra information such as signatures in a future commit.

Differential Revision: https://reviews.llvm.org/D35522

llvm-svn: 308531
2017-07-19 21:53:30 +00:00
Wolfgang Pieb e018bbd835 Fixing an issue with the initialization of LexicalScopes objects when mixing debug
and non-debug units.

Patch by Andrea DiBiagio.

Differential Revision:  https://reviews.llvm.org/D35637

llvm-svn: 308513
2017-07-19 19:36:40 +00:00
Simon Pilgrim c77e262260 {DAGCombine] Convert (Val & Mask) == Mask to Mask.isSubsetof(Val). NFCI.
llvm-svn: 308460
2017-07-19 13:39:58 +00:00
Serguei Katkov 4ea855ebe5 [CGP] Allow cycles during Phi traversal in OptimizaMemoryInst
Allowing cycles in Phi traversal increases the scope of optimize memory instruction
in case we are in loop.

The added test shows an example of enabling optimization inside a loop.

Reviewers: loladiro, spatel, efriedma
Reviewed By: efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35294

llvm-svn: 308419
2017-07-19 04:49:17 +00:00
Adrian Prantl d63bfd218b Debug Info: Add a file: field to DIImportedEntity.
DIImportedEntity has a line number, but not a file field. To determine
the decl_line/decl_file we combine the line number from the
DIImportedEntity with the file from the DIImportedEntity's scope. This
does not work correctly when the parent scope is a DINamespace or a
DIModule, both of which do not have a source file.

This patch adds a file field to DIImportedEntity to unambiguously
identify the source location of the using/import declaration.  Most
testcase updates are mechanical, the interesting one is the removal of
the FIXME in test/DebugInfo/Generic/namespace.ll.

This fixes PR33822. See https://bugs.llvm.org/show_bug.cgi?id=33822
for more context.

<rdar://problem/33357889>
https://bugs.llvm.org/show_bug.cgi?id=33822

Differential Revision: https://reviews.llvm.org/D35583

llvm-svn: 308398
2017-07-19 00:09:54 +00:00
Nirav Dave d839749ae8 [DAG] Improve Aliasing of operations to static alloca
Re-recommiting after landing DAG extension-crash fix.

Recommiting after adding check to avoid miscomputing alias information
on addresses of the same base but different subindices.

Memory accesses offset from frame indices may alias, e.g., we
may merge write from function arguments passed on the stack when they
are contiguous. As a result, when checking aliasing, we consider the
underlying frame index's offset from the stack pointer.

Static allocs are realized as stack objects in SelectionDAG, but its
offset is not set until post-DAG causing DAGCombiner's alias check to
consider access to static allocas to frequently alias. Modify isAlias
to consider access between static allocas and access from other frame
objects to be considered aliasing.

Many test changes are included here. Most are fixes for tests which
indirectly relied on our aliasing ability and needed to be modified to
preserve their original intent.

The remaining tests have minor improvements due to relaxed
ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll
which has a minor degradation dispite though the pre-legalized DAG is
improved.

Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand

Reviewed By: rnk

Subscribers: sdardis, nemanjai, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33345

llvm-svn: 308350
2017-07-18 20:06:24 +00:00
Nirav Dave 041b87758a [DAG] Reverse node replacement in extension operation. NFCI.
Reorder replacements to be user first in preparation for multi-level
folding to premptively avoid inadvertantly deleting later nodes from
sharing found from replacement.

llvm-svn: 308348
2017-07-18 19:49:20 +00:00
Nirav Dave 07871007aa [DAG] Avoid deleting nodes before combining them.
When replacing a node and it's operand, replacing the operand node may
cause the deletion of the original node leading to an assertion
failure. Case around these replacements to avoid this without relying
on inspecting the DELETED_NODE opcode in various extend
dagcombiner cases.

Fixes PR32515.

Reviewers: dbabokin, RKSimon, davide, chandlerc

Subscribers: chandlerc, llvm-commits

Differential Revision: https://reviews.llvm.org/D34095

llvm-svn: 308330
2017-07-18 17:39:15 +00:00
Nirav Dave f87c8e82f6 [DAG] Allow base element type of store merge type to also be a vector.
Correctly calculate merged vector size if MemVT is already a vector.

llvm-svn: 308312
2017-07-18 14:39:09 +00:00
Simon Pilgrim 4793a11df9 [DAGCombine] Fix issue with out of bound constant rotation (PR33828)
Take the modulo of rotations by a constant greater than or equal to the bit-width

llvm-svn: 308302
2017-07-18 12:31:46 +00:00
Diana Picus df4100b3d2 GlobalISel: Support G_(S|U)REM widening in LegalizerHelper
Treat widening G_SREM and G_UREM the same as G_SDIV and G_UDIV. This is
going to be used in the ARM backend (and that's when the test will come
too).

llvm-svn: 308278
2017-07-18 09:08:47 +00:00
Chandler Carruth a15e080b05 Revert r308025 due to uncovering a crash in SelectionDAG. This is filed
with a minimal test case in http://llvm.org/PR33833.

Original commit message:
  Improve Aliasing of operations to static alloca

llvm-svn: 308271
2017-07-18 07:53:47 +00:00
Serguei Katkov a6fba3d69f [CGP] Cleanup - remove redundant code in OptimizeMemoryInst. NFC
optimizeMemoryInst contains a vector AddrModeInsts.
The only use of this vector is to check that all instructions are in the same
block as memory instruction. This check is guarded by PhiSeen flag,
so if we traversed through phi node then we do not need to keep information
in AddrModeInsts. AddModeInsts is set first time we found some addressing mode
and updated if we found new one later.
We can find next addressing mode only if we traverse phi node so all code
related to update of AddModeInsts can be safely removed.

Reviewers: loladiro, spatel, efriedma
Reviewed By: efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35291

llvm-svn: 308265
2017-07-18 05:16:38 +00:00
Andrew Zhogin 67a64041b9 [DAGCombiner] Recognise vector rotations with non-splat constants
Fixes PR33691.

Differential revision: https://reviews.llvm.org/D35381

llvm-svn: 308150
2017-07-16 23:11:45 +00:00
Simon Pilgrim e7a2e6bdf1 Strip trailing whitespace. NFCI
llvm-svn: 308108
2017-07-15 19:29:19 +00:00
Haicheng Wu abdef9ee7e [TTI] Refine the cost of EXT in getUserCost()
Now, getUserCost() only checks the src and dst types of EXT to decide it is free
or not. This change first checks the types, then calls isExtFreeImpl(), and
check if EXT can form ExtLoad at last. Currently, only AArch64 has customized
implementation of isExtFreeImpl() to check if EXT can be folded into its use.

Differential Revision: https://reviews.llvm.org/D34458

llvm-svn: 308076
2017-07-15 02:12:16 +00:00
Dimitry Andric e4b97459f1 Fix mixed line terminators. NFC.
llvm-svn: 308052
2017-07-14 21:14:58 +00:00
Jakub Kuderski b292c22c8d [Dominators] Make IsPostDominator a template parameter
Summary:
DominatorTreeBase used to have IsPostDominators (bool) member to indicate if the tree is a dominator or a postdominator tree. This made it possible to switch between the two 'modes' at runtime, but it isn't used in practice anywhere.

This patch makes IsPostDominator a template argument. This way, it is easier to switch between different algorithms at compile-time based on this argument and design external utilities around it. It also makes it impossible to incidentally assign a postdominator tree to a dominator tree (and vice versa), and to further simplify template code in GenericDominatorTreeConstruction.

Reviewers: dberlin, sanjoy, davide, grosser

Reviewed By: dberlin

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D35315

llvm-svn: 308040
2017-07-14 18:26:09 +00:00
Nirav Dave a8f63af9d1 Improve Aliasing of operations to static alloca
Recommiting after adding check to avoid miscomputing alias information
on addresses of the same base but different subindices.

Memory accesses offset from frame indices may alias, e.g., we
may merge write from function arguments passed on the stack when they
are contiguous. As a result, when checking aliasing, we consider the
underlying frame index's offset from the stack pointer.

Static allocs are realized as stack objects in SelectionDAG, but its
offset is not set until post-DAG causing DAGCombiner's alias check to
consider access to static allocas to frequently alias. Modify isAlias
to consider access between static allocas and access from other frame
objects to be considered aliasing.

Many test changes are included here. Most are fixes for tests which
indirectly relied on our aliasing ability and needed to be modified to
preserve their original intent.

The remaining tests have minor improvements due to relaxed
ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll
which has a minor degradation dispite though the pre-legalized DAG is
improved.

Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand

Reviewed By: rnk

Subscribers: sdardis, nemanjai, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33345

llvm-svn: 308025
2017-07-14 13:56:21 +00:00
Jakub Kuderski 1d2dc681b1 [NFC] Move DEBUG_TYPE macro below includes...
in MachineCombiner.cpp.

llvm-svn: 307940
2017-07-13 19:30:52 +00:00
Simon Dardis 250256f9c9 Reland "[mips] Fix multiprecision arithmetic."
For multiprecision arithmetic on MIPS, rather than using ISD::ADDE / ISD::ADDC,
get SelectionDAG to break down the operation into ISD::ADDs and ISD::SETCCs.

For MIPS, only the DSP ASE has a carry flag, so in the general case it is not
useful to directly support ISD::{ADDE, ADDC, SUBE, SUBC} nodes.

Also improve the generation code in such cases for targets with
TargetLoweringBase::ZeroOrOneBooleanContent by directly using the result of the
comparison node rather than using it in selects. Similarly for ISD::SUBE /
ISD::SUBC.

Address optimization breakage by moving the generation of MIPS specific integer
multiply-accumulate nodes to before legalization.

This revolves PR32713 and PR33424.

Thanks to Simonas Kazlauskas and Pirama Arumuga Nainar for reporting the issue!

Reviewers: slthakur

Differential Revision: https://reviews.llvm.org/D33494

The previous version of this patch was too aggressive in producing fused
integer multiple-addition instructions.

llvm-svn: 307906
2017-07-13 11:28:05 +00:00
Simon Pilgrim bb85cb16e3 [DAGCombiner] Fix issue with rotate combines asserting if the constant value types differ from the result type.
llvm-svn: 307900
2017-07-13 10:41:49 +00:00
Simon Pilgrim 2dc42b7202 Use isNullConstantOrNullSplatConstant helper. NFCI.
llvm-svn: 307895
2017-07-13 09:39:00 +00:00
Hiroshi Inoue e9dea6e613 fix typos in comments and error messges; NFC
llvm-svn: 307885
2017-07-13 06:48:39 +00:00
Geoff Berry bea2e188e9 [TargetLowering] Add hook for adding target MMO flags when doing ISel.
Summary: Add TargetLowering hook getMMOFlags() to add target specific
MMO flags to load/store instructions created by ISel.

Reviewers: bogner, hfinkel, qcolombet, MatzeB

Subscribers: mcrosier, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D34962

llvm-svn: 307879
2017-07-13 03:49:42 +00:00
Geoff Berry 6748abe24d [MIR] Add support for printing and parsing target MMO flags
Summary: Add target hooks for printing and parsing target MMO flags.
Targets may override getSerializableMachineMemOperandTargetFlags() to
return a mapping from string to flag value for target MMO values that
should be serialized/parsed in MIR output.

Add implementation of this hook for AArch64 SuppressPair MMO flag.

Reviewers: bogner, hfinkel, qcolombet, MatzeB

Subscribers: mcrosier, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D34962

llvm-svn: 307877
2017-07-13 02:28:54 +00:00
Eli Friedman 6f7c9ad7d4 [CodeGenPrepare] Don't create dead instructions in addrmode sinking
When we fail to sink an instruction, we must make sure not to modify
the function; otherwise, we end up in an infinite loop because
CodeGenPrepare iterates until it doesn't make any changes.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33608 .

llvm-svn: 307866
2017-07-12 23:30:02 +00:00
Gerolf Hoflehner 3f164318e7 [SjLj] Replace recursive block marking algorithm with iterative algorithm
Summary:
Some programs run into a stack overflow issue. This change avoids this
problem by replacing the recursive algorithm with the iterative version.

Reviewers: MatzeB, t.p.northover, dblaikie

Reviewed By: MatzeB

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35105

llvm-svn: 307860
2017-07-12 23:05:15 +00:00
Daniel Neilson 965613ef1b Add element atomic memset intrinsic
Summary: Continuing the work from https://reviews.llvm.org/D33240, this change introduces an element unordered-atomic memset intrinsic. This intrinsic is essentially memset with the implementation requirement that all stores used for the assignment are done with unordered-atomic stores of a given element size.

Reviewers: eli.friedman, reames, mkazantsev, skatkov

Reviewed By: reames

Subscribers: jfb, dschuff, sbc100, jgravelle-google, aheejin, efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D34885

llvm-svn: 307854
2017-07-12 21:57:23 +00:00
Sam Clegg fd5ab25ae1 Remove unneeded use of #undef DEBUG_TYPE. NFC
Where is is needed (at the end of headers that define it), be
consistent about its use.

Also fix a few header guards that I found in the process.

Differential Revision: https://reviews.llvm.org/D34916

llvm-svn: 307840
2017-07-12 20:49:21 +00:00
Evandro Menezes 14ba3d7730 [CodeGen] Add dependency printer
Add SDep printer to make debugging sessions more productive.

Differential revision: https://reviews.llvm.org/D35144

llvm-svn: 307799
2017-07-12 15:30:59 +00:00
Daniel Neilson 57226ef33c Add element atomic memmove intrinsic
Summary: Continuing the work from https://reviews.llvm.org/D33240, this change introduces an element unordered-atomic memmove intrinsic. This intrinsic is essentially memmove with the implementation requirement that all loads/stores used for the copy are done with unordered-atomic loads/stores of a given element size.

Reviewers: eli.friedman, reames, mkazantsev, skatkov

Reviewed By: reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34884

llvm-svn: 307796
2017-07-12 15:25:26 +00:00
Konstantin Zhuravlyov bb80d3e1d3 Enhance synchscope representation
OpenCL 2.0 introduces the notion of memory scopes in atomic operations to
  global and local memory. These scopes restrict how synchronization is
  achieved, which can result in improved performance.

  This change extends existing notion of synchronization scopes in LLVM to
  support arbitrary scopes expressed as target-specific strings, in addition to
  the already defined scopes (single thread, system).

  The LLVM IR and MIR syntax for expressing synchronization scopes has changed
  to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this
  replaces *singlethread* keyword), or a target-specific name. As before, if
  the scope is not specified, it defaults to CrossThread/System scope.

  Implementation details:
    - Mapping from synchronization scope name/string to synchronization scope id
      is stored in LLVM context;
    - CrossThread/System and SingleThread scopes are pre-defined to efficiently
      check for known scopes without comparing strings;
    - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in
      the bitcode.

Differential Revision: https://reviews.llvm.org/D21723

llvm-svn: 307722
2017-07-11 22:23:00 +00:00
Evandro Menezes 0cd23f5642 [CodeGen] Rename DEBUG_TYPE to match passnames
Rename missing DEBUG_TYPE "machine-scheduler" from backend files, which were
absent from https://reviews.llvm.org/rL303921.

Differential revision: https://reviews.llvm.org/D35231

llvm-svn: 307719
2017-07-11 22:08:28 +00:00
Serguei Katkov 0e831c996c Revert Revert [MBP] do not rotate loop if it creates extra branch
This is a second attempt to land this patch.

The first one resulted in a crash of clang sanitizer buildbot.
The fix is here and regression test is added.

This is a last fix for the corner case of PR32214. Actually this is not really corner case in general.

We should not do a loop rotation if we create an additional branch due to it.
Consider the case where we have a loop chain H, M, B, C , where
H is header with viable fallthrough from pre-header and exit from the loop
M - some middle block
B - backedge to Header but with exit from the loop also.
C - some cold block of the loop.

Let's H is determined as a best exit. If we do a loop rotation M, B, C, H we can introduce the extra branch.
Let's compute the change in number of branches:
+1 branch from pre-header to header
-1 branch from header to exit
+1 branch from header to middle block if there is such
-1 branch from cold bock to header if there is one

So if C is not a predecessor of H then we introduce extra branch.

This change actually prohibits rotation of the loop if both true
  Best Exit has next element in chain as successor.
  Last element in chain is not a predecessor of first element of chain.

Reviewers: iteratee, xur, sammccall, chandlerc	
Reviewed By: iteratee
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34745

llvm-svn: 307631
2017-07-11 08:34:58 +00:00