Commit Graph

22369 Commits

Author SHA1 Message Date
Nirav Dave 6de2c77944 Capitalize ArgListEntry fields. NFC.
llvm-svn: 298178
2017-03-18 00:43:57 +00:00
Evgeniy Stepanov 51c962f72e Add !associated metadata.
This is an ELF-specific thing that adds SHF_LINK_ORDER to the global's section
pointing to the metadata argument's section. The effect of that is a reverse dependency
between sections for the linker GC.

!associated does not change the behavior of global-dce. The global
may also need to be added to llvm.compiler.used.

Since SHF_LINK_ORDER is per-section, !associated effectively enables
fdata-sections for the affected globals, the same as comdats do.

Differential Revision: https://reviews.llvm.org/D29104

llvm-svn: 298157
2017-03-17 22:17:24 +00:00
Eli Friedman 46ddab3810 [SelectionDAG] Remove redundant stores more aggressively.
Handle TokenFactors more aggressively in
SDValue::reachesChainWithoutSideEffects.  This isn't really a
very effective change anymore because of other changes to
chain handling, but it's a cheap check, and the expanded
comments are still useful.

It might be possible to loosen the hasOneUse() requirement with a
deeper analysis, but a naive implementation of that check would be
expensive.

Differential Revision: https://reviews.llvm.org/D29845

llvm-svn: 298156
2017-03-17 22:15:50 +00:00
Jun Bum Lim 4230101def [CodeGenPrep]Restructure promoting Ext to form ExtLoad
Summary:
Instead of just looking for a load which is mergable with Ext to form ExtLoad, trying to promote Exts as long as the cost is acceptable. This change is not a NFC as it continue promoting Exts even after finding a load during promotions; the change in arm64-codegen-prepare-extload.ll described in 2.b might show the case.
This change was motivated from D26524.  Based on this change, I will move the transformation performed in aarch64-type-promotion into CGP.

Reviewers: jmolloy, qcolombet, mcrosier, javed.absar

Reviewed By: qcolombet

Subscribers: rengolin, llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D27853

llvm-svn: 298114
2017-03-17 19:05:21 +00:00
Simon Pilgrim 5a68d401c7 [SelectionDAG] Add SelectionDAG.computeKnownBits test support for ISD::ABS
llvm-svn: 298108
2017-03-17 17:45:36 +00:00
Matthias Braun f0b68d3fbc SplitKit: Correctly implement partial subregister copies
- This fixes a bug where subregister incompatible with the vregs register
  class where used.
- Implement the case where multiple copies are necessary to cover a
  given lanemask.

Differential Revision: https://reviews.llvm.org/D30438

llvm-svn: 298025
2017-03-17 00:41:39 +00:00
Matthias Braun fa289ec7f0 VirtRegMap: Correctly deal with bundles when deleting identity copies.
This fixes two problems when VirtRegMap encounters bundles:

- When substituting a vreg subregister def with an actual register the
  internal read flag must be cleared.
- Removing an identity COPY from a bundle needs to use
  removeFromBundle() and a newly introduced function to update
  SlotIndexes.

No testcase here, because none of the in-tree targets trigger this,
however an upcoming commit of mine will need this and the testcase there
will trigger this.

Differential Revision: https://reviews.llvm.org/D30925

llvm-svn: 298024
2017-03-17 00:41:33 +00:00
Eric Christopher 53da761570 Remove LessPreciseFPMADOption from TargetOptions along with all of the
associated command line options and functions - it's currently unused
in all of llvm and clang other than being set and reset.

llvm-svn: 298023
2017-03-17 00:38:03 +00:00
Reid Kleckner 45707d4d5a Remove getArgumentList() in favor of arg_begin(), args(), etc
Users often call getArgumentList().size(), which is a linear way to get
the number of function arguments. arg_size(), on the other hand, is
constant time.

In general, the fact that arguments are stored in an iplist is an
implementation detail, so I've removed it from the Function interface
and moved all other users to the argument container APIs (arg_begin(),
arg_end(), args(), arg_size()).

Reviewed By: chandlerc

Differential Revision: https://reviews.llvm.org/D31052

llvm-svn: 298010
2017-03-16 22:59:15 +00:00
Simon Pilgrim fbfb19b1d7 Remove redundant conditions (PR31753). NFCI.
llvm-svn: 297976
2017-03-16 19:52:00 +00:00
Adrian Prantl dc855221af Attempt to fix bot failure on Windows.
Looks like this expression was accidentally using 32-bit arithmetic.

llvm-svn: 297969
2017-03-16 18:06:04 +00:00
Adrian Prantl 3621309e8d Rearrange fields. NFC.
llvm-svn: 297967
2017-03-16 17:42:47 +00:00
Adrian Prantl a63b8e8227 Rename methods in DwarfExpression to adhere to the LLVM coding guidelines.
NFC.

llvm-svn: 297966
2017-03-16 17:42:45 +00:00
Adrian Prantl 981f03e6a2 PR32288: More efficient encoding for DWARF expr subregister access.
Citing http://bugs.llvm.org/show_bug.cgi?id=32288

  The DWARF generated by LLVM includes this location:

  0x55 0x93 0x04 DW_OP_reg5 DW_OP_piece(4) When GCC's DWARF is simply
  0x55 (DW_OP_reg5) without the DW_OP_piece. I believe it's reasonable
  to assume the DWARF consumer knows which part of a register
  logically holds the value (low bytes, high bytes, how many bytes,
  etc) for a primitive value like an integer.

This patch gets rid of the redundant DW_OP_piece when a subregister is
at offset 0. It also adds previously missing subregister masking when
a subregister is followed by another operation.

(This reapplies r297960 with two additional testcase updates).

rdar://problem/31069390
https://reviews.llvm.org/D31010

llvm-svn: 297965
2017-03-16 17:14:56 +00:00
Adrian Prantl c5b3351750 Revert "PR32288: More efficient encoding for DWARF expr subregister access."
This reverts commit 2bf453116889a576956892ea9683db4fcd96e30e while investigating buildbot breakage.

llvm-svn: 297962
2017-03-16 16:38:22 +00:00
Adrian Prantl 8508e87998 PR32288: More efficient encoding for DWARF expr subregister access.
Citing http://bugs.llvm.org/show_bug.cgi?id=32288

  The DWARF generated by LLVM includes this location:

  0x55 0x93 0x04 DW_OP_reg5 DW_OP_piece(4) When GCC's DWARF is simply
  0x55 (DW_OP_reg5) without the DW_OP_piece. I believe it's reasonable
  to assume the DWARF consumer knows which part of a register
  logically holds the value (low bytes, high bytes, how many bytes,
  etc) for a primitive value like an integer.

This patch gets rid of the redundant DW_OP_piece when a subregister is
at offset 0. It also adds previously missing subregister masking when
a subregister is followed by another operation.

rdar://problem/31069390
https://reviews.llvm.org/D31010

llvm-svn: 297960
2017-03-16 16:34:14 +00:00
Oren Ben Simhon da59ffae91 Fixing typos.
llvm-svn: 297932
2017-03-16 08:15:52 +00:00
Jonas Paulsson 84319bfc40 [SelectionDAG] Optimize VSELECT->SETCC of incompatible or illegal types.
Don't scalarize VSELECT->SETCC when operands/results needs to be widened,
or when the type of the SETCC operands are different from those of the VSELECT.

(VSELECT SETCC) and (VSELECT (AND/OR/XOR (SETCC,SETCC))) are handled.

The previous splitting of VSELECT->SETCC in DAGCombiner::visitVSELECT() is
no longer needed and has been removed.

Updated tests:

test/CodeGen/ARM/vuzp.ll
test/CodeGen/NVPTX/f16x2-instructions.ll
test/CodeGen/X86/2011-10-19-widen_vselect.ll
test/CodeGen/X86/2011-10-21-widen-cmp.ll
test/CodeGen/X86/psubus.ll
test/CodeGen/X86/vselect-pcmp.ll

Review: Eli Friedman, Simon Pilgrim
https://reviews.llvm.org/D29489

llvm-svn: 297930
2017-03-16 07:17:12 +00:00
Kyle Butt 08655997eb CodeGen: BlockPlacement: Reduce TriangleChainCount to 2
This produces a 1% speedup on an important internal Google benchmark
(protocol buffers), with no other regressions in google or in the llvm
test-suite. Only 5 targets in the entire llvm test-suite are affected,
and on those 5 targets the size increase is 0.027%

llvm-svn: 297925
2017-03-16 01:32:29 +00:00
Craig Topper 6c66bbca4a [StackColoring] Remove unused header file for post-order traversal. Update comment that indicated we were using it when we really use a depth-first search. NFC
llvm-svn: 297904
2017-03-15 22:40:26 +00:00
Matt Arsenault 02d915be90 CodeGenPrepare: Sink addressing modes for atomics
llvm-svn: 297903
2017-03-15 22:35:20 +00:00
Eric Christopher 17ce8a2f5e Fix up grammar in a comment.
llvm-svn: 297898
2017-03-15 21:50:46 +00:00
Zvi Rackover 48cdde0e59 [DAGCombine] Bail out if can't create a vector with at least two elements
Summary:

Fixes pr32278

Reviewers: igorb, craig.topper, RKSimon, spatel, hfinkel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30978

llvm-svn: 297878
2017-03-15 19:48:36 +00:00
Ahmed Bougacha 2fb8030748 [GlobalISel] Avoid translating synthetic constants to new G_CONSTANTS.
Currently, we create a G_CONSTANT for every "synthetic" integer
constant operand (for instance, for the G_GEP offset).
Instead, share the G_CONSTANTs we might have created by going through
the ValueToVReg machinery.

When we're emitting synthetic constants, we do need to get Constants from
the context.  One could argue that we shouldn't modify the context at
all (for instance, this means that we're going to use a tad more memory
if the constant wasn't used elsewhere), but constants are mostly
harmless.  We currently do this for extractvalue and all.

For constant fcmp, this does mean we'll emit an extra COPY, which is not
necessarily more optimal than an extra materialized constant.
But that preserves the current intended design of uniqued G_CONSTANTs,
and the rematerialization problem exists elsewhere and should be
resolved with a single coherent solution.

llvm-svn: 297875
2017-03-15 19:21:11 +00:00
Tim Northover 0d98b03b9f ARM: avoid clobbering register in v6 jump-table expansion.
If we got unlucky with register allocation and actual constpool placement, we
could end up producing a tTBB_JT with an index that's already been clobbered.

Technically, we might be able to fix this situation up with a MOV, but I think
the constant islands pass is complex enough without having to deal with more
weird edge-cases.

llvm-svn: 297871
2017-03-15 18:38:13 +00:00
Ahmed Bougacha 07f247b6c2 [GlobalISel] Insert translated switch icmp blocks after switch parent.
Now that we preserve the IR layout, we would end up with all the newly
synthesized switch comparison blocks at the end of the function.
Instead, use a hopefully more reasonable layout, with the comparison
blocks immediately following the switch comparison blocks.

llvm-svn: 297869
2017-03-15 18:22:37 +00:00
Ahmed Bougacha a61c214f51 [GlobalISel] Preserve IR block layout.
It makes the output function layout more predictable;  the layout has
an effect on performance, we don't want it to be at the mercy of the
translator's visitation order and such.
The predictable output is also easier to digest.

getOrCreateBB isn't appropriately named anymore, as it never needs to
create anything.  Rename it and extract the MBB creation logic out of it.

A couple tests were sensitive to the order. Update them.

llvm-svn: 297868
2017-03-15 18:22:33 +00:00
Craig Topper bcb6093610 [CodeGen] Use APInt::setLowBits/setHighBits/setBitsFrom in more places
This patch replaces ORs with getHighBits/getLowBits etc. with setLowBits/setHighBits/setBitsFrom.

In a few of the places we weren't ORing, but the KnownZero/KnownOne vectors were already initialized to zero. We exploit this in most places already there were just some that were inconsistent.

Differential Revision: https://reviews.llvm.org/D30965

llvm-svn: 297860
2017-03-15 16:53:53 +00:00
Ahmed Bougacha 2b7f1377aa [GlobalISel] Remove dead member. NFC.
llvm-svn: 297855
2017-03-15 16:29:32 +00:00
Peter Collingbourne d44a01aae6 CodeGen: Use the source filename as the argument to .file, rather than the module ID.
Using the module ID here is wrong for a couple of reasons:
1) The module ID is not persisted, so we can end up with different
   object file contents given the same input file (for example if the same
   file is accessed via different paths).
2) With ThinLTO the module ID field may contain the path to a bitcode file,
   which is incorrect, as the .file argument is supposed to contain the path to
   a source file.

Differential Revision: https://reviews.llvm.org/D30584

llvm-svn: 297853
2017-03-15 16:24:52 +00:00
Simon Pilgrim 018eedd9a5 [SelectionDAG] Support BUILD_VECTOR implicit truncation in SelectionDAG::ComputeNumSignBits (PR32273)
llvm-svn: 297852
2017-03-15 16:22:24 +00:00
Nuno Lopes ae455c562d fix gcc -Wmisleading-indentation [NFC]
llvm-svn: 297816
2017-03-15 09:33:33 +00:00
Taewook Oh 1b192336d8 NFC: Reformats comments according to the coding guildelines.
llvm-svn: 297808
2017-03-15 06:29:23 +00:00
Taewook Oh fb1833efeb [BranchFolding] Merge debug locations from common tail instead of removing
Summary: D25742 improved the precision of debug locations for PGO by removing debug locations from common tail when tail-merging. However, if identical insturctions that are merged into a common tail have the same debug locations, there's no need to remove them. This patch creates a merged debug location of identical instructions across SameTails and assign it to the instruction in the common tail, so that the debug locations are maintained if they are same across identical instructions.

Reviewers: aprantl, probinson, MatzeB, rob.lougher

Reviewed By: aprantl

Subscribers: andreadb, llvm-commits

Differential Revision: https://reviews.llvm.org/D30226

llvm-svn: 297805
2017-03-15 05:44:59 +00:00
Peter Collingbourne 7f6e2c97b8 Ensure that prefix data is preserved with subsections-via-symbols
On MachO platforms that use subsections-via-symbols dead code stripping will
drop prefix data. Unfortunately there is no great way to convey the relationship
between a function and its prefix data to the linker. We are forced to use a bit
of a hack: we give the prefix data it’s own symbol, and mark the actual function
entry an .alt_entry.

Patch by Moritz Angermann!

Differential Revision: https://reviews.llvm.org/D30770

llvm-svn: 297804
2017-03-15 04:18:16 +00:00
Volkan Keles 4862c63594 [GlobalISel] IRTranslator: Return the scalar for <1 x Ty> constant vectors
Summary:
<1 x Ty> is not a legal vector type in LLT, we shouldn’t build G_MERGE_VALUES
instruction for them.

Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, ab, javed.absar

Reviewed By: qcolombet

Subscribers: dberris, rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D30948

llvm-svn: 297792
2017-03-14 23:45:06 +00:00
Daniel Sanders 8a4bae9993 [globalisel][tblgen] Add support for ComplexPatterns
Summary:
Adds a new kind of MachineOperand: MO_Placeholder.
This operand must not appear in the MIR and only exists as a way of
creating an 'uninitialized' operand until a matcher function overwrites it.

Depends on D30046, D29712

Reviewers: t.p.northover, ab, rovka, aditya_nandakumar, javed.absar, qcolombet

Reviewed By: qcolombet

Subscribers: dberris, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D30089

llvm-svn: 297782
2017-03-14 21:32:08 +00:00
Simon Pilgrim cf2da96c82 [SelectionDAG] Add a signed integer absolute ISD node
Reduced version of D26357 - based on the discussion on llvm-dev about canonicalization of UMIN/UMAX/SMIN/SMAX as well as ABS I've reduced that patch to just the ABS ISD node (with x86/sse support) to improve basic combines and lowering.

ARM/AArch64, Hexagon, PowerPC and NVPTX all have similar instructions allowing us to make this a generic opcode and move away from the hard coded tablegen patterns which makes it tricky to match more complex patterns.

At the moment this patch doesn't attempt legalization as we only create an ABS node if its legal/custom.

Differential Revision: https://reviews.llvm.org/D29639

llvm-svn: 297780
2017-03-14 21:26:58 +00:00
Sanjay Patel 8dd99dce6c [DAG] vector div/rem with any zero element in divisor is undef
This is the backend counterpart to:
https://reviews.llvm.org/rL297390
https://reviews.llvm.org/rL297409
and follow-up to:
https://reviews.llvm.org/rL297384

It surprised me that we need to duplicate the check in FoldConstantArithmetic and FoldConstantVectorArithmetic, 
but one or the other doesn't catch all of the test cases. There is an existing code comment about merging those 
someday.

Differential Revision: https://reviews.llvm.org/D30826

llvm-svn: 297762
2017-03-14 18:06:28 +00:00
Benjamin Kramer babcbddae0 [CodeGen] Fix -Wreorder warning.
llvm-svn: 297729
2017-03-14 10:29:47 +00:00
Sam Parker 916b1ba617 [ARM] Move SMULW[B|T] isel to DAG Combine
Create nodes for smulwb and smulwt and move their selection from
DAGToDAG to DAG combine. smlawb and smlawt can then be selected
using tablegen. Added some helper functions to detect shift patterns
as well as a wrapper around SimplifyDemandBits. Added a couple of
extra tests.

Differential Revision: https://reviews.llvm.org/D30708

llvm-svn: 297716
2017-03-14 09:13:22 +00:00
Oren Ben Simhon fe34c5e429 Disable Callee Saved Registers
Each Calling convention (CC) defines a static list of registers that should be preserved by a callee function. All other registers should be saved by the caller.
Some CCs use additional condition: If the register is used for passing/returning arguments – the caller needs to save it - even if it is part of the Callee Saved Registers (CSR) list.
The current LLVM implementation doesn’t support it. It will save a register if it is part of the static CSR list and will not care if the register is passed/returned by the callee.
The solution is to dynamically allocate the CSR lists (Only for these CCs). The lists will be updated with actual registers that should be saved by the callee.
Since we need the allocated lists to live as long as the function exists, the list should reside inside the Machine Register Info (MRI) which is a property of the Machine Function and managed by it (and has the same life span).
The lists should be saved in the MRI and populated upon LowerCall and LowerFormalArguments.
The patch will also assist to implement future no_caller_saved_regsiters attribute intended for interrupt handler CC.

Differential Revision: https://reviews.llvm.org/D28566

llvm-svn: 297715
2017-03-14 09:09:26 +00:00
Nirav Dave 4fc8401abf Recommitting Craig Topper's patch now that r296476 has been recommitted.
When checking if chain node is foldable, make sure the intermediate nodes have a single use across all results not just the result that was used to reach the chain node.

This recovers a test case that was severely broken by r296476, my making sure we don't create ADD/ADC that loads and stores when there is also a flag dependency.

llvm-svn: 297698
2017-03-14 01:42:23 +00:00
Nirav Dave 54e22f33d9 In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Recommiting with compiler time improvements

    Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner.

    * Simplify Consecutive Merge Store Candidate Search

    Now that address aliasing is much less conservative, push through
    simplified store merging search and chain alias analysis which only
    checks for parallel stores through the chain subgraph. This is cleaner
    as the separation of non-interfering loads/stores from the
    store-merging logic.

    When merging stores search up the chain through a single load, and
    finds all possible stores by looking down from through a load and a
    TokenFactor to all stores visited.

    This improves the quality of the output SelectionDAG and the output
    Codegen (save perhaps for some ARM cases where we correctly constructs
    wider loads, but then promotes them to float operations which appear
    but requires more expensive constant generation).

    Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)

    Additional Minor Changes:

      1. Finishes removing unused AliasLoad code

      2. Unifies the chain aggregation in the merged stores across code
         paths

      3. Re-add the Store node to the worklist after calling
         SimplifyDemandedBits.

      4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
         arbitrary, but seems sufficient to not cause regressions in
         tests.

      5. Remove Chain dependencies of Memory operations on CopyfromReg
         nodes as these are captured by data dependence

      6. Forward loads-store values through tokenfactors containing
          {CopyToReg,CopyFromReg} Values.

      7. Peephole to convert buildvector of extract_vector_elt to
         extract_subvector if possible (see
         CodeGen/AArch64/store-merge.ll)

      8. Store merging for the ARM target is restricted to 32-bit as
         some in some contexts invalid 64-bit operations are being
         generated. This can be removed once appropriate checks are
         added.

    This finishes the change Matt Arsenault started in r246307 and
    jyknight's original patch.

    Many tests required some changes as memory operations are now
    reorderable, improving load-store forwarding. One test in
    particular is worth noting:

      CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
      forwarding converts a load-store pair into a parallel store and
      a memory-realized bitcast of the same value. However, because we
      lose the sharing of the explicit and implicit store values we
      must create another local store. A similar transformation
      happens before SelectionDAG as well.

    Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

llvm-svn: 297695
2017-03-14 00:34:14 +00:00
Adrian Prantl 19aadf57c8 Revert "Debug Info: Add basic support for external types references."
This reverts commit r242302. External type refs of this form were
never used by any LLVM frontend so this is effectively dead code.
(They were introduced to support clang module debug info, but in the
end we came up with a better design that doesn't use this feature at
all.)

rdar://problem/25897929

Differential Revision: https://reviews.llvm.org/D30917

llvm-svn: 297684
2017-03-13 22:56:14 +00:00
Marcello Maggioni 598d89a3f4 [IPRA] Change algorithm for RegUsageInfoCollector.
The previous algorithm for RegUsageInfoCollector had pretty bad
performance on architectures with a lot of registers that alias
a lot one another, because we potentially iterate for every register
over all the aliasing registers. This costs even more if the function
is small and doesn't define a lot of registers.
This patch changes the algorithm to one that while iterating over
all the registers it will iterate over the aliasing registers only
if the register itself is defined.
This should be faster based on the assumption that only a subset
of the whole LLVM registers set is actually defined in the function.

Differential Revision: https://reviews.llvm.org/D30880

llvm-svn: 297673
2017-03-13 21:42:53 +00:00
Volkan Keles 38a91a0de6 GlobalISel: Translate ConstantDataVector
Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, javed.absar, ab

Reviewed By: qcolombet, dsanders, ab

Subscribers: dberris, rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30216

llvm-svn: 297670
2017-03-13 21:36:19 +00:00
Jessica Paquette c984e21394 [Outliner] Add tail call support
This commit adds tail call support to the MachineOutliner pass. This allows
the outliner to insert jumps rather than calls in areas where tail calling is
possible. Outlined tail calls include the return or terminator of the basic
block being outlined from.

Tail call support allows the outliner to take returns and terminators into
consideration while finding candidates to outline. It also allows the outliner
to save more instructions. For example, in the X86-64 outliner, a tail called
outlined function saves one instruction since no return has to be inserted.

llvm-svn: 297653
2017-03-13 18:39:33 +00:00
Simon Pilgrim fa97699d09 Fix -Wsentinel warning
llvm-svn: 297560
2017-03-11 12:56:02 +00:00
Amaury Sechet d1ec5d54cf Use setBits in SelectionDAG
Summary: As per title.

Reviewers: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30836

llvm-svn: 297559
2017-03-11 11:24:03 +00:00
Quentin Colombet ee8a4f51c4 [IRTranslator] Simplify error handling for translating constants. NFC.
We don't need to check whether the fallback path is enabled to return
false. Just do that all the time on error cases, the caller knows (or
at least should know!) how to handle the failing case.

llvm-svn: 297535
2017-03-11 00:28:33 +00:00
Stanislav Mekhanoshin b546174b0e Fix subreg value numbers in handleMoveUp
The problem can occur in presence of subregs. If we are swapping two
instructions defining different subregs of the same register we will
get a new liveout from a block. We need to preserve value number for
block's liveout for successor block's livein to match.

Differential Revision: https://reviews.llvm.org/D30558

llvm-svn: 297534
2017-03-11 00:14:52 +00:00
Simon Pilgrim 455e2f3313 Strip trailing whitespace.
llvm-svn: 297529
2017-03-10 22:53:19 +00:00
Simon Pilgrim 83c37c4dac Fix redundant condition (PR32138)
'!A || (A && B)' is equivalent to '!A || B'

llvm-svn: 297527
2017-03-10 22:44:47 +00:00
Volkan Keles 225921ac41 [GlobalISel] LegalizerHelper: Lower (G_FSUB X, Y) to (G_FADD X, (G_FNEG Y))
Summary: No test case as none of the in-tree targets with GlobalISel support has this condition.

Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, ab

Reviewed By: qcolombet

Subscribers: dberris, rovka, kristof.beyls, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D30786

llvm-svn: 297512
2017-03-10 21:25:09 +00:00
Volkan Keles 970fee4bfe GlobalISel: Translate ConstantAggregateZero vectors
Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, javed.absar

Reviewed By: qcolombet

Subscribers: dberris, rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30259

llvm-svn: 297509
2017-03-10 21:23:13 +00:00
Volkan Keles 04cb08cc83 [GlobalISel] Translate insertelement and extractelement
Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, javed.absar

Reviewed By: qcolombet

Subscribers: dberris, rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30761

llvm-svn: 297495
2017-03-10 19:08:28 +00:00
Simon Pilgrim 7dedbfa89d [SelectionDAG] Add support for BUILD_VECTOR to ComputeNumSignBits
llvm-svn: 297492
2017-03-10 18:36:46 +00:00
Volkan Keles 685fbda217 [GlobalISel] Make LegalizerInfo accessible in LegalizerHelper
Summary:
We don’t actually use LegalizerInfo in Legalizer pass, it’s just passed
as an argument.

In order to check if an instruction is legal or not, we need to get LegalizerInfo
by calling `MI.getParent()->getParent()->getSubtarget().getLegalizerInfo()`.
Instead, make LegalizerInfo accessible in LegalizerHelper.

Reviewers: qcolombet, aditya_nandakumar, dsanders, ab, t.p.northover, kristof.beyls

Reviewed By: qcolombet

Subscribers: dberris, llvm-commits, rovka

Differential Revision: https://reviews.llvm.org/D30838

llvm-svn: 297491
2017-03-10 18:34:57 +00:00
Amaury Sechet 62e0759d56 [SelectionDAG] Make SelectionDAG aware of the known bits in USUBO and SSUBO and SUBC.
Summary:
Depends on D30379

This improves the state of things for the sub class of operation.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30436

llvm-svn: 297482
2017-03-10 17:26:44 +00:00
Amaury Sechet 69fa16c810 [SelectionDAG] Make SelectionDAG aware of the known bits in UADDO and SADDO.
Summary: As per title. This is extracted from D29872 and I threw SADDO in.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30379

llvm-svn: 297479
2017-03-10 17:06:52 +00:00
Simon Pilgrim b02667c469 [APInt] Add APInt::insertBits() method to insert an APInt into a larger APInt
We currently have to insert bits via a temporary variable of the same size as the target with various shift/mask stages, resulting in further temporary variables, all of which require the allocation of memory for large APInts (MaskSizeInBits > 64).

This is another of the compile time issues identified in PR32037 (see also D30265).

This patch adds the APInt::insertBits() helper method which avoids the temporary memory allocation and masks/inserts the raw bits directly into the target.

Differential Revision: https://reviews.llvm.org/D30780

llvm-svn: 297458
2017-03-10 13:44:32 +00:00
Ahmed Bougacha d22b84b9d0 [GlobalISel] Use ImmutableCallSite instead of templates. NFC.
ImmutableCallSite abstracts away CallInst and InvokeInst. Use it!

llvm-svn: 297426
2017-03-10 00:25:44 +00:00
Ahmed Bougacha 4ec6d5abed [GlobalISel] Fallback when failing to translate invoke.
We unintentionally stopped falling back in r293670.

While there, change an unusual construct.

llvm-svn: 297425
2017-03-10 00:25:35 +00:00
Tim Northover aa995c98f4 GlobalISel: support trivial inlineasm calls.
They're used for nefarious purposes by ObjC.

llvm-svn: 297422
2017-03-09 23:36:26 +00:00
Eli Friedman 93f47e5ffb Refactor alias check from MISched into common helper. NFC.
Differential Revision: https://reviews.llvm.org/D30598

llvm-svn: 297421
2017-03-09 23:33:36 +00:00
Amaury Sechet e7d102cf02 [DAGCombiner] Do various combine on uaddo.
Summary: This essentially does the same transform as for ADC.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30417

llvm-svn: 297416
2017-03-09 22:47:00 +00:00
Tim Northover d1e951e5eb GlobalISel: inform FrameLowering when we emit a function call.
Amongst other things (I expect) this is necessary to ensure decent backtraces
when an "unreachable" is involved.

llvm-svn: 297413
2017-03-09 22:00:39 +00:00
Tim Northover 7a9ea8f628 GlobalISel: put debug info for static allocas in the MachineFunction.
The good reason to do this is that static allocas are pretty simple to handle
(especially at -O0) and avoiding tracking DBG_VALUEs throughout the pipeline
should give some kind of performance benefit.

The bad reason is that the debug pipeline is an unholy mess of implicit
contracts, where determining whether "DBG_VALUE %reg, imm" actually implies a
load or not involves the services of at least 3 soothsayers and the sacrifice
of at least one chicken.  And it still gets it wrong if the variable is at SP
directly.

llvm-svn: 297410
2017-03-09 21:12:06 +00:00
Amaury Sechet 10425de063 [DAGCombiner] Do various combine on usubo.
Summary: This essentially does the same transform as for SUBC.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30437

llvm-svn: 297404
2017-03-09 19:28:00 +00:00
Sanjay Patel df21979db7 [DAG] recognize div/rem by 0 as undef before trying constant folding
As discussed in the review thread for rL297026, this is actually 2 changes that 
would independently fix all of the test cases in the patch:

1. Return undef in FoldConstantArithmetic for div/rem by 0.
2. Move basic undef simplifications for div/rem (simplifyDivRem()) before 
   foldBinopIntoSelect() as a matter of efficiency.

I will handle the case of vectors with any zero element as a follow-up. That change
is the DAG sibling for D30665 + adding a check of vector elements to FoldConstantVectorArithmetic().

I'm deleting the test for PR30693 because it does not test for the actual bug any more
(dangers of using bugpoint).

Differential Revision:
https://reviews.llvm.org/D30741

llvm-svn: 297384
2017-03-09 15:02:25 +00:00
Adam Nemet 5361b82d54 [SSP] In opt remarks, stream Function directly
With this, it shows up as an attribute in YAML and non-printable characters
are properly removed by GlobalValue::getRealLinkageName.

llvm-svn: 297362
2017-03-09 06:10:27 +00:00
Matt Arsenault 9a3fd87523 DAG: Check no signed zeros instead of unsafe math attribute
llvm-svn: 297354
2017-03-09 01:36:39 +00:00
Konstantin Zhuravlyov d5561e0a0b [DebugInfo] Emit address space with DW_AT_address_class attribute for pointer and reference types
Differential Revision: https://reviews.llvm.org/D29670

llvm-svn: 297320
2017-03-08 23:55:44 +00:00
Jessica Paquette d4cb9c6da0 [Outliner] Fix memory leak in suffix tree.
This commit changes the BumpPtrAllocator for suffix tree nodes to a SpecificBumpPtrAllocator.
Before, node construction was leaking memory because of the DenseMap in SuffixTreeNodes.
Changing this to a SpecificBumpPtrAllocator allows this memory to properly be released.

llvm-svn: 297319
2017-03-08 23:55:33 +00:00
Tim Northover 7596bd7a27 GlobalISel: correctly handle trivial fcmp predicates.
It makes sense to only do them once in IRTranslator rather than making everyone
deal with them.

llvm-svn: 297304
2017-03-08 18:49:54 +00:00
Volkan Keles 5698b2ae6e [GlobalISel] Add default action for G_FNEG
Summary: rL297171 introduced G_FNEG for floating-point negation instruction and IRTranslator started to translate `FSUB -0.0, X` to `FNEG X`. This patch adds a default action for G_FNEG to avoid breaking existing targets.

Reviewers: qcolombet, ab, kristof.beyls, t.p.northover, aditya_nandakumar, dsanders

Reviewed By: qcolombet

Subscribers: dberris, rovka, llvm-commits

Differential Revision: https://reviews.llvm.org/D30721

llvm-svn: 297301
2017-03-08 18:09:14 +00:00
Eli Friedman c2c2e21d77 [DAGCombine] Simplify ISD::AND in GetDemandedBits.
This helps in cases involving bitfields where an AND is exposed by
legalization.

Differential Revision: https://reviews.llvm.org/D30472

llvm-svn: 297249
2017-03-08 00:56:35 +00:00
Konstantin Zhuravlyov f9b41cd3d8 [DebugInfo] Make legal and emit DW_OP_swap and DW_OP_xderef
Differential Revision: https://reviews.llvm.org/D29672

llvm-svn: 297247
2017-03-08 00:28:57 +00:00
Daniel Sanders 1351db49b2 Fix additional constructor call missed by r297241.
It was added between my build+test and my commit.

llvm-svn: 297244
2017-03-07 23:32:10 +00:00
Daniel Sanders 52b4ce727a Recommit: [globalisel] Change LLT constructor string into an LLT-based object that knows how to generate it.
Summary:
This will allow future patches to inspect the details of the LLT. The implementation is now split between
the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns.

Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem.

The problem with the previous commit appears to have been that TableGen was including CodeGen/LowLevelType.h instead of Support/LowLevelTypeImpl.h.

Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar

Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30046

llvm-svn: 297241
2017-03-07 23:20:35 +00:00
Tim Northover 542d1c1463 GlobalISel: use inserts for landingpad instead of sequences.
llvm-svn: 297237
2017-03-07 23:04:06 +00:00
Tim Northover 2eb18d3c4b GlobalISel: fix legalization of G_INSERT
We were calculating incorrect extract/insert offsets by trying to be too
tricksy with min/max. It's clearer to just split the logic up into "register
starts before this segment" vs "after".

llvm-svn: 297226
2017-03-07 21:24:33 +00:00
Yaron Keren 75fadfc774 Implement FreeMachineFunction::getPassName().
llvm-svn: 297222
2017-03-07 20:59:08 +00:00
Ahmed Bougacha 55d10423a6 [GlobalISel] Don't translate intrinsics with metadata parameters.
Some intrinsics take metadata parameters.  These all need custom
handling of some form, and cannot possibly be lowered generically to
G_INTRINSIC calls with vreg operands.
Reject them, instead of hitting an assert later in getOrCreateVReg.

llvm-svn: 297209
2017-03-07 20:53:09 +00:00
Ahmed Bougacha 5c7924fca5 [GlobalISel] Avoid invalidating ValToVReg when translating no-op bitcast.
When we translate a no-op (same type) bitcast, we try to be clever and
only emit a COPY if we already assigned a vreg to the defined value.
However, when we didn't, we tried to assign to a reference into the
ValToVReg DenseMap, even though the RHS of the assignment
(getOrCreateVReg) could potentially grow that DenseMap, invalidating the
reference.

Avoid that by getting the source vreg first.
I audited the rest of the translator; this is the only tricky case.

The test is quite unwieldy, as the problem is caused by the DenseMap
growing, which happens after the 47th mapped value.

llvm-svn: 297208
2017-03-07 20:53:06 +00:00
Ahmed Bougacha 38455ea8a6 [GlobalISel] Relax vector G_SELECT assertion.
For vector operands, the `select` instruction supports both vector and
non-vector conditions.  The MIR builder had an overly restrictive
assertion, that only accepted vector conditions for vector selects
(in effect implementing ISD::VSELECT).

Make it possible to express the full range of G_SELECTs.

llvm-svn: 297207
2017-03-07 20:53:03 +00:00
Ahmed Bougacha adce3ee219 [GlobalISel] Slightly clean up DBG_VALUE FP build code.
I messed up my rebases leading to r297200, and ended up with stale (but
working) code.  Fix it.

llvm-svn: 297205
2017-03-07 20:52:57 +00:00
Ahmed Bougacha c373262d52 [GlobalISel] Ignore %noreg when applying default regbank mapping.
When computing the mapping for non-generic instructions, we skipped
%noreg operands, because we can't always reason about their banks.

Also skip them when applying the mapping.  Otherwise, we could end
up with mappings that we can't apply.

While there, duplicate an assert to distinguish between the two
error conditions.

llvm-svn: 297201
2017-03-07 20:34:23 +00:00
Ahmed Bougacha 4826bae8b4 [GlobalISel] Emit DBG_VALUE %noreg for non-int/fp constant values.
When a dbg_value has a constant operand that isn't representable in MI,
there isn't much we can do.  Use %noreg (0) for those situations.
This matches the SelectionDAG behavior.

llvm-svn: 297200
2017-03-07 20:34:20 +00:00
Arnold Schwaighofer 69e74b48f2 SjLjEHPrepare: Fix the pass for swifterror arguments
We cannot leave the identity copies 'select true, arg, undef' that this pass
inserts for arguments to simplify handling of values on swifterror arguments.

swifterror arguments have restrictions on their uses.

rdar://30839288

llvm-svn: 297197
2017-03-07 20:29:02 +00:00
Daniel Sanders 8ebec37d26 Revert r297177: Change LLT constructor string into an LLT-based object ...
More module problems. This time it only showed up in the stage 2 compile of
clang-x86_64-linux-selfhost-modules-2 but not the stage 1 compile.

Somehow, this change causes the build to need Attributes.gen before it's been
generated.

llvm-svn: 297188
2017-03-07 19:21:23 +00:00
Daniel Sanders 8612326a08 [globalisel] Change LLT constructor string into an LLT-based object that knows how to generate it.
Summary:
This will allow future patches to inspect the details of the LLT. The implementation is now split between
the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns.

Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem.

Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar

Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30046

llvm-svn: 297177
2017-03-07 18:32:25 +00:00
Volkan Keles 20d3c4200d [GlobalISel] Translate floating-point negation
Reviewers: qcolombet, javed.absar, aditya_nandakumar, dsanders, t.p.northover, ab

Reviewed By: qcolombet

Subscribers: dberris, rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30671

llvm-svn: 297171
2017-03-07 18:03:28 +00:00
Tim Northover c2c545b8f7 GlobalISel: restrict G_EXTRACT instruction to just one operand.
A bit more painful than G_INSERT because it was more widely used, but this
should simplify the handling of extract operations in most locations.

llvm-svn: 297100
2017-03-06 23:50:28 +00:00
Sanjay Patel 7f18ec50ba [DAG] refactor related div/rem folds; NFCI
This is known incomplete and not called in the right order relative to
other folds, but that's the current behavior. I'm just trying to clean
this up before making actual functional changes to make the patch smaller.

The logic here should mimic the IR equivalents that are in InstSimplify's
simplifyDivRem().

llvm-svn: 297086
2017-03-06 22:32:40 +00:00
Paul Robinson f96e21ad6d [DWARFv5] Update definitions to match published spec.
Some late additions to DWARF v5 were not in Dwarf.def; also one form
was redefined.  Add the new cases to relevant switches in different
parts of LLVM.  Replace DW_FORM_ref_sup with DW_FORM_ref_sup[4,8].

I did not add support for DW_FORM_strx3/addrx3 other that defining the
constants. We don't have any infrastructure to support these.

Differential Revision: http://reviews.llvm.org/D30664

llvm-svn: 297085
2017-03-06 22:20:03 +00:00
Jessica Paquette 596f483a5e [Outliner] Fixed Asan bot failure in r296418
Fixed the asan bot failure which led to the last commit of the outliner being reverted.
The change is in lib/CodeGen/MachineOutliner.cpp in the SuffixTree's constructor. LeafVector
is no longer initialized using reserve but just a standard constructor.

llvm-svn: 297081
2017-03-06 21:31:18 +00:00
Krzysztof Parzyszek 5b8fae5edd [IfConversion] Only renormalize probabilities if branches are analyzable
If a block has non-analyzable branches, the listed successors don't need
to add up to one. For example, if a block has a conditional tail call,
that tail call will not have a corresponding successor in the successor
list, but will still be a possible branch.

Differential Revision: https://reviews.llvm.org/D30556

llvm-svn: 297054
2017-03-06 19:12:42 +00:00
Tim Northover 95b6d5f2b1 GlobalISel: don't emit degenerate G_INSERT instructions.
Before, we were producing G_INSERT instructions that were actually closer to a
cast or even a COPY when both input and output sizes are the same. This doesn't
really make sense and means that everything interpreting a G_INSERT also has to
handle all these kinds of casts.

So now we detect these degenerate cases and emit real casts instead.

llvm-svn: 297051
2017-03-06 19:04:17 +00:00
Tim Northover 81dafc1c88 GlobalISel: add buildUndef method to MachineIRBuilder. NFC.
llvm-svn: 297044
2017-03-06 18:36:40 +00:00
Tim Northover 75e0b91e59 GlobalISel: refactor legalization of G_INSERT.
Now that G_INSERT instructions can only insert one register, this code was
overly general. In another direction it didn't handle registers that crossed
split boundaries properly, which needed to be fixed.

llvm-svn: 297042
2017-03-06 18:23:04 +00:00
Sanjay Patel 7f7947bf41 [DAGCombiner] simplify div/rem-by-0
Refactoring of duplicated code and more fixes to follow.

This is motivated by the post-commit comments for r296699:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170306/435182.html

Ie, we can crash if we're missing obvious simplifications like this that
exist in the IR simplifier or if these occur later than expected.

The x86 change for non-splat division shows a potential opportunity to improve
vector codegen: we assumed that since only one lane had meaningful results, we
should do the math in scalar. But that means moving back and forth from vector
registers.

llvm-svn: 297026
2017-03-06 16:36:42 +00:00
Sanjay Patel 6b029a5380 [DAG] fix formatting; NFC
llvm-svn: 297015
2017-03-06 15:27:57 +00:00
Sanjay Patel 5273afd4bb [DAG] fix typo in comment; NFC
llvm-svn: 297011
2017-03-06 15:07:43 +00:00
Dean Michael Berris 7e8eea429f [XRay] Allow logging the first argument of a function call.
Summary:
Functions with the "xray-log-args" attribute will have a special XRay sled kind
emitted, for compiler-rt to copy any call arguments to your logging handler.

For practical and performance reasons, only the first argument is supported, and
only up to 64 bits.

Reviewers: dberris

Reviewed By: dberris

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29702

llvm-svn: 296998
2017-03-06 06:48:56 +00:00
Simon Pilgrim 584d6d9d91 [SelectionDAG] Fix vector splitting for *_EXTEND_VECTOR_INREG instructions
Found by fuzz testing after rL296985 landed

llvm-svn: 296989
2017-03-05 15:52:18 +00:00
Simon Pilgrim 9f5c251d57 [X86][SSE] Lower 128-bit vectors to SIGN/ZERO_EXTEND_VECTOR_IN_REG ops
As described on PR31712, we miss a variety of legalization combines because we lower these to X86ISD::VSEXT/VZEXT despite them having the same functionality. This patch makes 128-bit (SSE41) SIGN/ZERO_EXTEND_VECTOR_IN_REG ops legal, adds the necessary tablegen plumbing and uses a helper 'getExtendInVec' to decide when to use SIGN/ZERO_EXTEND_VECTOR_IN_REG or VSEXT/VZEXT.

We're missing a couple of shuffle combines that will be added in a future patch for review.

Later patches can then support the AVX2 cases as a mixture of SIGN/ZERO_EXTEND and SIGN/ZERO_EXTEND_VECTOR_IN_REG, and then finally deal with the AVX512 cases.

Differential Revision: https://reviews.llvm.org/D30549

llvm-svn: 296985
2017-03-05 09:57:20 +00:00
Craig Topper 6ffc044b2f [DAGCombine] Use APInt::operator|(uint64_t) instead of creating a temporary APInt and calling APInt::Or. NFC
This is more efficient by itself. But this is prep for a future patch that may remove APInt::Or while making operator| support rvalue references similar to add/sub.

llvm-svn: 296981
2017-03-05 01:08:16 +00:00
Sanjay Patel 066f3208bf [DAGCombiner] allow transforming (select Cond, C +/- 1, C) to (add(ext Cond), C)
select Cond, C +/- 1, C --> add(ext Cond), C -- with a target hook.

This is part of the ongoing process to obsolete D24480.  The motivation is to 
canonicalize to select IR in InstCombine whenever possible, so we need to have a way to
undo that easily in codegen.
 
PowerPC is an obvious winner for this kind of transform because it has fast and complete 
bit-twiddling abilities but generally lousy conditional execution perf (although this might
have changed in recent implementations).

x86 also sees some wins, but the effect is limited because these transforms already mostly
exist in its target-specific combineSelectOfTwoConstants(). The fact that we see any x86 
changes just shows that that code is a mess of special-case holes. We may be able to remove 
some of that logic now.

My guess is that other targets will want to enable this hook for most cases. The likely 
follow-ups would be to add value type and/or the constants themselves as parameters for the
hook. As the tests in select_const.ll show, we can transform any select-of-constants to 
math/logic, but the general transform for any 2 constants needs one more instruction 
(multiply or 'and').

ARM is one target that I think may not want this for most cases. I see infinite loops there
because it wants to use selects to enable conditionally executed instructions.

Differential Revision: https://reviews.llvm.org/D30537

llvm-svn: 296977
2017-03-04 19:18:09 +00:00
Florian Hahn 6406f98342 [legalize-types] Remove stale entries from SoftenedFloats.
Summary:
When replacing a SDValue, we should remove the replaced value from
SoftenedFloats (and possibly the other maps as well?).

When we revisit a Node because it needs analyzing again, we have to
remove all result values from SoftenedFloats (and possibly other maps?).

This fixes the fp128 test failures with expensive checks for X86.

I think we probably should also remove the values from the other maps
(PromotedIntegers and so on), let me know what you think.

Reviewers: baldrick, bogner, davidxl, ab, arsenm, pirama, chh, RKSimon

Reviewed By: chh

Subscribers: danalbert, wdng, srhines, hfinkel, sepavloff, llvm-commits

Differential Revision: https://reviews.llvm.org/D29265

llvm-svn: 296964
2017-03-04 12:00:35 +00:00
Eli Friedman 49f0220086 [MISched] Remove unused arguments. NFC.
llvm-svn: 296934
2017-03-04 00:42:55 +00:00
Matthias Braun ffe40dd69e RegAllocGreedy: Follow-up to r296722
We can now end up in situations where we initiate LiveIntervalUnion
queries with different SubRanges against the same register unit, so the
assert() no longer holds in all cases. Just recalculate now when we know
the cache is out of date.

llvm-svn: 296928
2017-03-03 23:27:20 +00:00
Tim Northover 3e6a7afd81 GlobalISel: constrain G_INSERT to inserting just one value per instruction.
It's much easier to reason about single-value inserts and no-one was actually
using the variadic variants before.

llvm-svn: 296923
2017-03-03 23:05:47 +00:00
Tim Northover bf017293af GlobalISel: add merge/unmerge nodes for legalization.
These are simplified variants of the current G_SEQUENCE and G_EXTRACT, which
assume the individual parts will be contiguous, homogeneous, and occupy the
entirity of the larger register. This makes reasoning about them much easer
since you only have to look at the first register being merged and the result
to know what the instruction is doing.

I intend to gradually replace all uses of the more complicated sequence/extract
with these (or single-element insert/extracts), and then remove the older
variants. For now we start with legalization.

llvm-svn: 296921
2017-03-03 22:46:09 +00:00
Matthias Braun a04d7ad851 RegisterCoalescer: Simplify subrange splitting code; NFC
- Use slightly better variable names / compute in a more direct way.

llvm-svn: 296905
2017-03-03 19:05:34 +00:00
Simon Pilgrim 6dfab414db Use APInt::setBits instead of OR'ing in a separate APInt::getBitsSet call
llvm-svn: 296886
2017-03-03 17:03:52 +00:00
Simon Pilgrim cf12b5e1a6 Use APInt::getOneBitSet instead of APInt::getBitsSet for sign bit mask creation
Avoids all the unnecessary extra bitrange creation/shift stages.

llvm-svn: 296879
2017-03-03 16:35:57 +00:00
Simon Pilgrim 10754abe7e Use APInt::getOneBitSet instead of APInt::getBitsSet for sign bit mask creation
Avoids all the unnecessary extra bitrange creation/shift stages.

llvm-svn: 296871
2017-03-03 14:25:46 +00:00
Simon Pilgrim b01bb3a1b2 Fix Wdocumentation warning
llvm-svn: 296866
2017-03-03 12:09:11 +00:00
Chandler Carruth ce52b80744 [SDAG] Revert r296476 (and r296486, r296668, r296690).
This patch causes compile times for some patterns to explode. I have
a (large, unreduced) test case that slows down by more than 20x and
several test cases slow down by 2x. I'm sending some of the test cases
directly to Nirav and following up with more details in the review log,
but this should unblock anyone else hitting this.

llvm-svn: 296862
2017-03-03 10:02:25 +00:00
Adrian Prantl ea8880b81f LiveDebugValues: Assume calls never clobber SP.
A call should never modify the stack pointer, but some backends are
not so sure about this and never list SP in the regmask. For the
purposes of LiveDebugValues we assume a call never clobbers SP. We
already have a similar workaround in DbgValueHistoryCalculator (which
we hopefully can retire soon).

This fixes the availabilty of local ASANified variables on AArch64.

rdar://problem/27757381

llvm-svn: 296847
2017-03-03 01:08:25 +00:00
Kyle Butt 1fa6030767 CodeGen: BlockPlacement: Precompute layout for chains of triangles.
For chains of triangles with small join blocks that can be tail duplicated, a
simple calculation of probabilities is insufficient. Tail duplication
can be profitable in 3 different ways for these cases:

1) The post-dominators marked 50% are actually taken 56% (This shrinks with
   longer chains)
2) The chains are statically correlated. Branch probabilities have a very
   U-shaped distribution.
   [http://nrs.harvard.edu/urn-3:HUL.InstRepos:24015805]
   If the branches in a chain are likely to be from the same side of the
   distribution as their predecessor, but are independent at runtime, this
   transformation is profitable. (Because the cost of being wrong is a small
   fixed cost, unlike the standard triangle layout where the cost of being
   wrong scales with the # of triangles.)
3) The chains are dynamically correlated. If the probability that a previous
   branch was taken positively influences whether the next branch will be
   taken
We believe that 2 and 3 are common enough to justify the small margin in 1.

The code pre-scans a function's CFG to identify this pattern and marks the edges
so that the standard layout algorithm can use the computed results.

llvm-svn: 296845
2017-03-03 01:00:22 +00:00
Taewook Oh 96c6415697 [DAGCombiner] Fix DebugLoc propagation when folding !(x cc y) -> (x !cc y)
Summary:
Currently, when 't1: i1 = setcc t2, t3, cc' followed by 't4: i1 = xor t1, Constant:i1<-1>' is folded into 't5: i1 = setcc t2, t3 !cc', SDLoc of newly created SDValue 't5' follows SDLoc of 't4', not 't1'. However, as the opcode of newly created SDValue is 'setcc', it make more sense to take DebugLoc from 't1' than 't4'. For the code below

```
extern int bar();
extern int baz();

int foo(int x, int y) {
  if (x != y)
    return bar();
  else
    return baz();
}
```

, following is the bitcode representation of 'foo' at the end of llvm-ir level optimization:

```
define i32 @foo(i32 %x, i32 %y) !dbg !4 {
entry:
  tail call void @llvm.dbg.value(metadata i32 %x, i64 0, metadata !9, metadata !11), !dbg !12
  tail call void @llvm.dbg.value(metadata i32 %y, i64 0, metadata !10, metadata !11), !dbg !13
  %cmp = icmp ne i32 %x, %y, !dbg !14
  br i1 %cmp, label %if.then, label %if.else, !dbg !16

if.then:                                          ; preds = %entry
  %call = tail call i32 (...) @bar() #3, !dbg !17
  br label %return, !dbg !18

if.else:                                          ; preds = %entry
  %call1 = tail call i32 (...) @baz() #3, !dbg !19
  br label %return, !dbg !20

return:                                           ; preds = %if.else, %if.then
  %retval.0 = phi i32 [ %call, %if.then ], [ %call1, %if.else ]
  ret i32 %retval.0, !dbg !21
}

!14 = !DILocation(line: 5, column: 9, scope: !15)
!16 = !DILocation(line: 5, column: 7, scope: !4)

```

As you can see, in 'entry' block, 'icmp' instruction and 'br' instruction have different debug locations. However, with current implementation, there's no distinction between debug locations of these two when they are lowered to asm instructions. This is because 'icmp' and 'br' become 'setcc' 'xor' and 'brcond' in SelectionDAG, where SDLoc of 'setcc' follows the debug location of 'icmp' but SDLOC of 'xor' and 'brcond' follows the debug location of 'br' instruction, and SDLoc of 'xor' overwrites SDLoc of 'setcc' when they are folded. This patch addresses this issue.

Reviewers: atrick, bogner, andreadb, craig.topper, aprantl

Reviewed By: andreadb

Subscribers: jlebar, mkuper, jholewinski, andreadb, llvm-commits

Differential Revision: https://reviews.llvm.org/D29813

llvm-svn: 296825
2017-03-02 21:58:35 +00:00
Sanjay Patel 7884dcb788 [DAG] early exit to improve readability and formatting of visitMemCmpCall(); NFCI
llvm-svn: 296824
2017-03-02 21:56:43 +00:00
Kyle Butt 1393761e0c CodeGen: MachineBlockPlacement: Remove the unused outlining heuristic.
Outlining optional branches isn't a good heuristic, and it's never been
on by default. Remove it to clean things up.

llvm-svn: 296818
2017-03-02 21:44:24 +00:00
Zachary Turner d9dc2829ea [Support] Move Stream library from MSF -> Support.
After several smaller patches to get most of the core improvements
finished up, this patch is a straight move and header fixup of
the source.

Differential Revision: https://reviews.llvm.org/D30266

llvm-svn: 296810
2017-03-02 20:52:51 +00:00
Sanjay Patel 209b0f9aad [DAG] improve documentation comments; NFC
llvm-svn: 296808
2017-03-02 20:48:08 +00:00
Simon Pilgrim 7b227fecb5 Fix some Wdocumentation warnings
llvm-svn: 296783
2017-03-02 18:59:07 +00:00
Sanjay Patel fffa179837 [DAGCombiner] avoid assertion when folding binops with opaque constants
This bug was introduced with:
https://reviews.llvm.org/rL296699

There may be a way to loosen the restriction, but for now just bail out
on any opaque constant.

The tests show that opacity is target-specific. This goes back to cost
calculations in ConstantHoisting based on TTI->getIntImmCost().

llvm-svn: 296768
2017-03-02 17:18:56 +00:00
Sanjay Patel f7aba7ba22 fix typo in comment; NFC
llvm-svn: 296760
2017-03-02 16:37:24 +00:00
Serge Pavlov e2bf69715f Do not verify MachimeDominatorTree if it is not calculated
If dominator tree is not calculated or is invalidated, set corresponding
pointer in the pass state to nullptr. Such pointer value will indicate
that operations with dominator tree are not allowed. In particular, it
allows to skip verification for such pass state. The dominator tree is
not calculated if the machine dominator pass was skipped, it occures in
the case of entities with linkage available_externally.

The change fixes some test fails observed when expensive checks
are enabled.

Differential Revision: https://reviews.llvm.org/D29280

llvm-svn: 296742
2017-03-02 12:00:10 +00:00
Matthias Braun dbcf9e2ee4 LiveRegMatrix: Fix some subreg interference checks
Surprisingly, one of the three interference checks in LiveRegMatrix was
using the main live range instead of the apropriate subregister range
resulting in unnecessarily conservative results.

llvm-svn: 296722
2017-03-02 00:35:08 +00:00
Paul Robinson a94f76b18c Remove spurious use of LLVM_FALLTHROUGH (NFC)
llvm-svn: 296713
2017-03-01 23:59:11 +00:00
Amaury Sechet 71f511fd1e [DAGCombiner] mulhi + 1 never overflow.
Summary:
This can be used to optimize large multiplications after legalization.

Depends on D29565

Reviewers: mkuper, spatel, RKSimon, zvi, bkramer, aaboud, craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29587

llvm-svn: 296711
2017-03-01 23:44:17 +00:00
Ahmed Bougacha 120ae22d70 [GlobalISel] Add a way for targets to enable GISel.
Until now, we've had to use -global-isel to enable GISel.  But using
that on other targets that don't support it will result in an abort, as we
can't build a full pipeline.
Additionally, we want to experiment with enabling GISel by default for
some targets: we can't just enable GISel by default, even among those
target that do have some support, because the level of support varies.

This first step adds an override for the target to explicitly define its
level of support.  For AArch64, do that using
a new command-line option (I know..):
  -aarch64-enable-global-isel-at-O=<N>
Where N is the opt-level below which GISel should be used.

Default that to -1, so that we still don't enable GISel anywhere.
We're not there yet!

While there, remove a couple LLVM_UNLIKELYs.  Building the pipeline is
such a cold path that in practice that shouldn't matter at all.

llvm-svn: 296710
2017-03-01 23:33:08 +00:00
Sanjay Patel 92938657a0 [DAGCombiner] fold binops with constant into select-of-constants
This is part of the ongoing attempt to improve select codegen for all targets and select 
canonicalization in IR (see D24480 for more background). The transform is a subset of what
is done in InstCombine's FoldOpIntoSelect().

I first noticed a regression in the x86 avx512-insert-extract.ll tests with a patch that 
hopes to convert more selects to basic math ops. This appears to be a general missing DAG
transform though, so I added tests for all standard binops in rL296621 
(PowerPC was chosen semi-randomly; it has scripted FileCheck support, but so do ARM and x86).

The poor output for "sel_constants_shl_constant" is tracked with:
https://bugs.llvm.org/show_bug.cgi?id=32105

Differential Revision: https://reviews.llvm.org/D30502

llvm-svn: 296699
2017-03-01 22:51:31 +00:00
Victor Leschuk d7bfa40ace [DebugInfo] [DWARFv5] Unique abbrevs for DIEs with different implicit_const values
Take DW_FORM_implicit_const attribute value into account when profiling
DIEAbbrevData.

Currently if we have two similar types with implicit_const attributes and
different values we end up with only one abbrev in .debug_abbrev section.
For example consider two structures: S1 with implicit_const attribute ATTR
and value VAL1 and S2 with implicit_const ATTR and value VAL2.
The .debug_abbrev section will contain only 1 related record:

[N] DW_TAG_structure_type       DW_CHILDREN_yes
        DW_AT_ATTR        DW_FORM_implicit_const  VAL1
        // ....

This is incorrect as struct S2 (with VAL2) will use abbrev record with VAL1.

With this patch we will have two different abbreviations here:

[N] DW_TAG_structure_type       DW_CHILDREN_yes
        DW_AT_ATTR        DW_FORM_implicit_const  VAL1
        // ....

[M] DW_TAG_structure_type       DW_CHILDREN_yes
        DW_AT_ATTR        DW_FORM_implicit_const  VAL2
        // ....

llvm-svn: 296691
2017-03-01 22:13:42 +00:00
Benjamin Kramer 0e429606b0 [DAGCombiner] Remove non-ascii character and reflow comment.
llvm-svn: 296690
2017-03-01 22:10:43 +00:00
Matthias Braun 173e11439e LIU:::Query: Query LiveRange instead of LiveInterval; NFC
- We only need the information from the base class, not the additional
  details in the LiveInterval class.
- Spread more `const`
- Some code cleanup

llvm-svn: 296684
2017-03-01 21:48:12 +00:00
Reid Kleckner f7c0980c10 Elide argument copies during instruction selection
Summary:
Avoids tons of prologue boilerplate when arguments are passed in memory
and left in memory. This can happen in a debug build or in a release
build when an argument alloca is escaped.  This will dramatically affect
the code size of x86 debug builds, because X86 fast isel doesn't handle
arguments passed in memory at all. It only handles the x86_64 case of up
to 6 basic register parameters.

This is implemented by analyzing the entry block before ISel to identify
copy elision candidates. A copy elision candidate is an argument that is
used to fully initialize an alloca before any other possibly escaping
uses of that alloca. If an argument is a copy elision candidate, we set
a flag on the InputArg. If the the target generates loads from a fixed
stack object that matches the size and alignment requirements of the
alloca, the SelectionDAG builder will delete the stack object created
for the alloca and replace it with the fixed stack object. The load is
left behind to satisfy any remaining uses of the argument value. The
store is now dead and is therefore elided. The fixed stack object is
also marked as mutable, as it may now be modified by the user, and it
would be invalid to rematerialize the initial load from it.

Supersedes D28388

Fixes PR26328

Reviewers: chandlerc, MatzeB, qcolombet, inglorion, hans

Subscribers: igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D29668

llvm-svn: 296683
2017-03-01 21:42:00 +00:00
Matthias Braun 702f55bb4a LIU::Query: Remove always false member+getter; NFC
llvm-svn: 296675
2017-03-01 21:02:52 +00:00
Nemanja Ivanovic b223cfabcc Improve scheduling with branch coalescing
This patch adds a MachineSSA pass that coalesces blocks that branch
on the same condition.

Committing on behalf of Lei Huang.

Differential Revision: https://reviews.llvm.org/D28249

llvm-svn: 296670
2017-03-01 20:29:34 +00:00
Nirav Dave 0a4703b5ec [DAG] Prevent Stale nodes from entering worklist
Add check that deleted nodes do not get added to worklist. This can
occur when a node's operand is simplified to an existing node.

This fixes PR32108.

Reviewers: jyknight, hfinkel, chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30506

llvm-svn: 296668
2017-03-01 20:19:38 +00:00
Paul Robinson d4f1c487f3 Alphabetize some cases (NFC)
llvm-svn: 296655
2017-03-01 19:01:47 +00:00
Paul Robinson 91d74813a6 [DWARF] Default lower bound should respect requested DWARF version.
DWARF may define a default lower-bound for arrays in languages defined
in a particular DWARF version.  But the logic to suppress an
unnecessary lower-bound attribute was looking at the hard-coded
default DWARF version, not the version that had been requested.

Also updated the list with all languages defined in DWARF v5.

Differential Revision: http://reviews.llvm.org/D30484

llvm-svn: 296652
2017-03-01 18:32:37 +00:00
Artur Pilipenko e1b2d31468 [DAGCombiner] Support {a|s}ext, {a|z|s}ext load nodes in load combine
Resubmit r295336 after the bug with non-zero offset patterns on BE targets is fixed (r296336).

Support {a|s}ext, {a|z|s}ext load nodes as a part of load combine patters.

Reviewed By: filcab

Differential Revision: https://reviews.llvm.org/D29591

llvm-svn: 296651
2017-03-01 18:12:29 +00:00
Ahmed Bougacha 20b3e9a835 [CodeGen] Remove dead FastISel code after SDAG emitted a tailcall.
When SDAGISel (top-down) selects a tail-call, it skips the remainder
of the block.

If, before that, FastISel (bottom-up) selected some of the (no-op) next
few instructions, we can end up with dead instructions following the
terminator (selected by SDAGISel).

We need to erase them, as we know they aren't necessary (in addition to
being incorrect).

We already do this when FastISel falls back on the tail-call itself.
Also remove the FastISel-emitted code if we fallback on the
instructions between the tail-call and the return.

llvm-svn: 296552
2017-03-01 00:43:42 +00:00
Ahmed Bougacha 67d1c7c3c2 [GlobalISel] Replace all combined G_EXTRACT uses.
Iterating on the use-list we're modifying doesn't work: after the first
iteration, the use-list iterator will point to a MachineOperand
referencing the new register.  This caused us to skip the other uses to
replace.

Instead, use MRI.replaceRegWith(), which accounts for this behavior.

llvm-svn: 296551
2017-03-01 00:43:39 +00:00
Paul Robinson 3443575c03 Add missing module/license header. NFC.
llvm-svn: 296550
2017-03-01 00:14:42 +00:00