Commit Graph

23270 Commits

Author SHA1 Message Date
Victor Leschuk 6aedf785c5 Remove duplicate code
llvm-svn: 311675
2017-08-24 17:02:38 +00:00
Victor Leschuk 471579b52e Add missing break in switch
llvm-svn: 311673
2017-08-24 16:57:10 +00:00
Matt Arsenault d664315ae8 IPRA: Don't assume called function is first call operand
Fixes not finding the called global for AMDGPU
call pseudoinstructions, which prevented IPRA
from doing much.

llvm-svn: 311637
2017-08-24 07:55:15 +00:00
Matt Arsenault 00459e4a06 IPRA: Exit early on functions without calls
llvm-svn: 311636
2017-08-24 07:55:13 +00:00
Wei Ding a131d3fb29 Add ‘llvm.experimental.constrained.fma‘ Intrinsic.
Differential Revision: http://reviews.llvm.org/D36335

llvm-svn: 311629
2017-08-24 04:18:24 +00:00
Hans Wennborg c39ec95d88 [DAG] Fix Node Replacement in PromoteIntBinOp
When one operand is a user of another in a promoted binary operation
we may replace and delete the returned value before returning
triggering an assertion. Reorder node replacements to prevent this.

Fixes PR34137.

Landing on behalf of Nirav.

Differential Revision: https://reviews.llvm.org/D36581

llvm-svn: 311623
2017-08-24 01:08:27 +00:00
Adrian Prantl 7db6b5e2b3 Retire the llvm.dbg.mir hack after r311594.
llvm-svn: 311610
2017-08-23 22:02:36 +00:00
Aditya Nandakumar efd8a84cd5 [GISEl]: Translate phi into G_PHI
G_PHI has the same semantics as PHI but also has types.
This lets us verify that the types in the G_PHI are consistent.
This also allows specifying legalization actions for G_PHIs.

https://reviews.llvm.org/D36990

llvm-svn: 311596
2017-08-23 20:45:48 +00:00
Reid Kleckner 950567aac4 Attempt to fix the BUILD_SHARED_LIBS build after the DIExpression change
llvm-svn: 311595
2017-08-23 20:39:35 +00:00
Reid Kleckner 6d353348e5 Parse and print DIExpressions inline to ease IR and MIR testing
Summary:
Most DIExpressions are empty or very simple. When they are complex, they
tend to be unique, so checking them inline is reasonable.

This also avoids the need for CodeGen passes to append to the
llvm.dbg.mir named md node.

See also PR22780, for making DIExpression not be an MDNode.

Reviewers: aprantl, dexonsmith, dblaikie

Subscribers: qcolombet, javed.absar, eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37075

llvm-svn: 311594
2017-08-23 20:31:27 +00:00
Lei Huang 0cb591fc4c Update branch coalescing to be a PowerPC specific pass
Implementing this pass as a PowerPC specific pass.  Branch coalescing utilizes
the analyzeBranch method which currently does not include any implicit operands.
This is not an issue on PPC but must be handled on other targets.

Differential Revision : https: // reviews.llvm.org/D32776

llvm-svn: 311588
2017-08-23 19:25:04 +00:00
Dean Michael Berris 0884b73220 [XRay][CodeGen] Use PIC-friendly code in XRay sleds; remove synthetic references in .text
Summary:
This change achieves two things:

  - Redefine the Custom Event handling instrumentation points emitted by
    the compiler to not require dynamic relocation of references to the
    __xray_CustomEvent trampoline.

  - Remove the synthetic reference we emit at the end of a function that
    we used to keep auxiliary sections alive in favour of SHF_LINK_ORDER
    associated with the section where the function is defined.

To achieve the custom event handling change, we've had to introduce the
concept of sled versioning -- this will need to be supported by the
runtime to allow us to understand how to turn on/off the new version of
the custom event handling sleds. That change has to land first before we
change the way we write the sleds.

To remove the synthetic reference, we rely on a relatively new linker
feature that preserves the sections that are associated with each other.
This allows us to limit the effects on the .text section of ELF
binaries.

Because we're still using absolute references that are resolved at
runtime for the instrumentation map (and function index) maps, we mark
these sections write-able. In the future we can re-define the entries in
the map to use relative relocations instead that can be statically
determined by the linker. That change will be a bit more invasive so we
defer this for later.

Depends on D36816.

Reviewers: dblaikie, echristo, pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36615

llvm-svn: 311525
2017-08-23 04:49:41 +00:00
Matthias Braun 8426d1342d Add test case for r311511
This also changes the TailDuplicator to be configured explicitely
pre/post regalloc rather than relying on the isSSA() flag. This was
necessary to have `llc -run-pass` work reliably.

llvm-svn: 311520
2017-08-23 03:17:59 +00:00
Matthias Braun 55bc9b3f9e TargetInstrInfo: Change duplicate() to work on bundles.
Adds infrastructure to clone whole instruction bundles rather than just
single instructions. This fixes a bug where tail duplication would
unbundle instructions while cloning.

This should unbreak the "Clang Stage 1: cmake, RA, with expensive checks
enabled" build on greendragon. The bot broke with r311139 hitting this
pre-existing bug.

A proper testcase will come next.

llvm-svn: 311511
2017-08-22 23:56:30 +00:00
Craig Topper 35189d5221 [SelectionDAG] Make ISD::isConstantSplatVector always return an element sized APInt.
This partially reverts r311429 in favor of making ISD::isConstantSplatVector do something not confusing. Turns out the only other user of it was also having to deal with the weird property of it returning a smaller size.

So rather than continue to deal with this quirk everywhere, just make the interface do something sane.

Differential Revision: https://reviews.llvm.org/D37039

llvm-svn: 311510
2017-08-22 23:54:13 +00:00
Jonas Devlieghere a680a8f5f8 [Debug info] Add new DbgValues after looping over DAG
I was contacted by Jesper Antonsson from Ericsson who ran into problems
with r311181 in their test suites with for an out-of-tree target.
Because of the latter I don't have a reproducer, but we definitely don't
want to modify the data structure on which we are iterating inside the
loop.

llvm-svn: 311466
2017-08-22 16:28:07 +00:00
Renato Golin c070c73d5e [ARM] Avoid creating duplicate ANDs in SelectionDAG
When expanding a BRCOND into a BR_CC, do not create an AND 1
if one already exists.

Review: D36705

Patch by Joel Galenson <jgalenson@google.com>

llvm-svn: 311447
2017-08-22 11:02:45 +00:00
Sjoerd Meijer e0c933f5d6 [SelectionDAG] Add getNode debug messages
This adds debug messages to various functions that create new SDValue nodes.
This is e.g. useful to have during legalization, as otherwise it can prints
legalization info of nodes that did not appear in the dumps before.

Differential Revision: https://reviews.llvm.org/D36984

llvm-svn: 311444
2017-08-22 10:43:51 +00:00
Craig Topper b49f0893b2 [X86] Prevent several calls to ISD::isConstantSplatVector from returning a narrower APInt than the original scalar type
ISD::isConstantSplatVector can shrink to the smallest splat width. But we don't check the size of the resulting APInt at all. This can cause us to misinterpret the results.

This patch just adds a flag to prevent the APInt from changing width.

Fixes PR34271.

Differential Revision: https://reviews.llvm.org/D36996

llvm-svn: 311429
2017-08-22 05:40:17 +00:00
Quentin Colombet 4056e80719 [RegAlloc] Make sure live-ranges reflect the state of the IR when removing them
When removing a live-range we used to not touch them making debug
prints harder to read because the IR was not matching what the
live-ranges information was saying.

This only affects debug printing and allows to put stronger asserts in
the code (see r308906 for instance).

llvm-svn: 311401
2017-08-21 22:56:18 +00:00
Benjamin Kramer 49a49fe816 Move helper classes into anonymous namespaces.
No functionality change intended.

llvm-svn: 311288
2017-08-20 13:03:48 +00:00
Jatin Bhateja 6b4c205685 [DAGCombiner] Extending pattern detection for vector shuffle.
Summary:
    If all the operands of a BUILD_VECTOR extract elements from same vector then split the
    vector efficiently based on the maximum vector access index.

    Reviewers: zvi, delena, RKSimon, thakis

    Reviewed By: RKSimon

    Subscribers: chandlerc, eladcohen, llvm-commits

    Differential Revision: https://reviews.llvm.org/D35788

llvm-svn: 311255
2017-08-19 18:08:59 +00:00
Jatin Bhateja 66f7958e91 Revert rL311247 : To rectify commit message.
Summary: This reverts commit rL311247.

Differential Revision: https://reviews.llvm.org/D36927

llvm-svn: 311252
2017-08-19 17:59:58 +00:00
Jatin Bhateja 6f0d0d23b0 Merge branch 'arcpatch-D35788'
llvm-svn: 311247
2017-08-19 17:00:04 +00:00
Jatin Bhateja 1c56863739 Revert rL311242 "Extension of shuffle vector pattern detection, updating post rebase."
Summary:

This reverts commit rL311242.

Differential Revision: https://reviews.llvm.org/D36924

llvm-svn: 311246
2017-08-19 16:40:06 +00:00
Jatin Bhateja 313f97dd84 Extension of shuffle vector pattern detection, updating post rebase.
llvm-svn: 311242
2017-08-19 15:58:36 +00:00
Adrian Prantl 2116dd360a Filter out non-constant DIGlobalVariableExpressions reachable via the CU
They won't affect the DWARF output, but they will mess with the
sorting of the fragments. This fixes the crash reported in PR34159.

https://bugs.llvm.org/show_bug.cgi?id=34159

llvm-svn: 311217
2017-08-19 01:15:06 +00:00
Jonas Devlieghere e101b07a1d [Debug info] Transfer DI to fragment expressions for split integer values.
This patch teaches the SDag type legalizer how to split up debug info for
integer values that are split into a hi and lo part.

(re-commit)

Differential Revision: https://reviews.llvm.org/D36805

llvm-svn: 311181
2017-08-18 18:07:00 +00:00
Craig Topper e3edd9c9be [DAGCombiner] Fix bad comment that had immediate values swapped from the code and what they need to be to make sense. NFC
llvm-svn: 311144
2017-08-18 04:52:46 +00:00
Geoff Berry bd47e8a4f7 Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" round 2
This reverts commit r311135.

sanitizer-x86_64-linux-android buildbot is timing out with just this
patch applied.

llvm-svn: 311142
2017-08-18 01:43:11 +00:00
Richard Smith c0541dfa3e Increase tail dup threshold for -O3 from 3 to 4.
We see a modest performance improvement from this slightly higher tail dup threshold.

Differential Revision: https://reviews.llvm.org/D36775

llvm-svn: 311139
2017-08-17 23:38:41 +00:00
Geoff Berry 51f52c4fca Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Two issues identified by buildbots were addressed:
    - The pass no longer forwards COPYs to physical register uses, since
      doing so can break code that implicitly relies on the physical
      register number of the use.
    - The pass no longer forwards COPYs to undef uses, since doing so
      can break the machine verifier by creating LiveRanges that don't
      end on a use (since the undef operand is not considered a use).

    [MachineCopyPropagation] Extend pass to do COPY source forwarding

    This change extends MachineCopyPropagation to do COPY source forwarding.

    This change also extends the MachineCopyPropagation pass to be able to
    be run during register allocation, after physical registers have been
    assigned, but before the virtual registers have been re-written, which
    allows it to remove virtual register COPY LiveIntervals that become dead
    through the forwarding of all of their uses.

    Reviewers: qcolombet, javed.absar, MatzeB, jonpa

    Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny

    Differential Revision: https://reviews.llvm.org/D30751

llvm-svn: 311135
2017-08-17 23:06:55 +00:00
Eugene Zelenko 6e07bfd0d9 [CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 311124
2017-08-17 21:26:39 +00:00
Jonas Devlieghere 30756da212 Revert "[Debug info] Transfer DI to fragment expressions for split integer values."
This reverts commit r311102.

llvm-svn: 311111
2017-08-17 17:58:33 +00:00
Jonas Devlieghere 622fedc001 [Debug info] Transfer DI to fragment expressions for split integer values.
This patch teaches the SDag type legalizer how to split up debug info for
integer values that are split into a hi and lo part.

Differential Revision: https://reviews.llvm.org/D36805

llvm-svn: 311102
2017-08-17 17:06:48 +00:00
Adrian Prantl 6a57daad81 Improve line debug info when translating a CaseBlock to SDNodes.
The SelectionDAGBuilder translates various conditional branches into
CaseBlocks which are then translated into SDNodes. If a conditional
branch results in multiple CaseBlocks only the first CaseBlock is
translated into SDNodes immediately, the rest of the CaseBlocks are
put in a queue and processed when all LLVM IR instructions in the
basic block have been processed.

When a CaseBlock is transformed into SDNodes the SelectionDAGBuilder
is queried for the current LLVM IR instruction and the resulting
SDNodes are annotated with the debug info of the current
instruction (if it exists and has debug metadata).

When the deferred CaseBlocks are processed, the SelectionDAGBuilder
does not have a current LLVM IR instruction, and the resulting SDNodes
will not have any debuginfo. As DwarfDebug::beginInstruction() outputs
a .loc directive for the first instruction in a labeled
block (typically the case for something coming from a CaseBlock) this
tends to produce a line-0 directive.

This patch changes the handling of CaseBlocks to store the current
instruction's debug info into the CaseBlock when it is created (and the
SelectionDAGBuilder knows the current instruction) and to always use
the stored debug info when translating a CaseBlock to SDNodes.

Patch by Frej Drejhammar!

Differential Revision: https://reviews.llvm.org/D36671

llvm-svn: 311097
2017-08-17 16:57:13 +00:00
Simon Pilgrim 8be9f4af4f [DAGCombiner] Add support for non-uniform constant vectors to (mul x, (1 << c)) -> x << c
llvm-svn: 311083
2017-08-17 13:03:34 +00:00
Jonas Paulsson 57a705d9d0 [SystemZ, MachineScheduler] Improve post-RA scheduling.
The idea of this patch is to continue the scheduler state over an MBB boundary
in the case where the successor block has only one predecessor. This means
that the scheduler will continue in the successor block (after emitting any
branch instructions) with e.g. maintained processor resource counters.
Benchmarks have been confirmed to benefit from this.

The algorithm in MachineScheduler.cpp that extracts scheduling regions of an
MBB has been extended so that the strategy may optionally reverse the order
of processing the regions themselves. This is controlled by a new method
doMBBSchedRegionsTopDown(), which defaults to false.

Handling the top-most region of an MBB first also means that a top-down
scheduler can continue the scheduler state across any scheduling boundary
between to regions inside MBB.

Review: Ulrich Weigand, Matthias Braun, Andy Trick.
https://reviews.llvm.org/D35053

llvm-svn: 311072
2017-08-17 08:33:44 +00:00
Elad Cohen 124d32829c [SelectionDAG] Teach the vector-types operand scalarizer about SETCC
When v1i1 is legal (e.g. AVX512) the legalizer can reach
a case where a v1i1 SETCC with an illgeal vector type operand
wasn't scalarized (since v1i1 is legal) but its operands does
have to be scalarized. This used to assert because SETCC was
missing from the vector operand scalarizer.

This patch attemps to teach the legalizer to handle these cases
by scalazring the operands, converting the node into a scalar
SETCC node.

Differential revision: https://reviews.llvm.org/D36651

llvm-svn: 311071
2017-08-17 08:06:36 +00:00
Serguei Katkov 9e5604dbe1 [CGP] Fix the rematerialization of gc.relocates
If we want to substitute the relocation of derived pointer with gep of base then
we must ensure that relocation of base dominates the relocation of derived pointer.

Currently only check for basic block is present. However it is possible that both
relocation are in the same basic block but relocation of derived pointer is defined
earlier.

The patch moves the relocation of base pointer right before relocation of derived
pointer in this case.

Reviewers: sanjoy,artagnon,igor-laevsky,reames
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36462

llvm-svn: 311067
2017-08-17 05:48:30 +00:00
Geoff Berry 4e38e02e6f Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
This reverts commit r311038.

Several buildbots are breaking, and at least one appears to be due to
the forwarding of physical regs enabled by this change.  Reverting while
I investigate further.

llvm-svn: 311062
2017-08-17 04:04:11 +00:00
Geoff Berry 87f8d25150 [MachineCopyPropagation] Extend pass to do COPY source forwarding
This change extends MachineCopyPropagation to do COPY source forwarding.

This change also extends the MachineCopyPropagation pass to be able to
be run during register allocation, after physical registers have been
assigned, but before the virtual registers have been re-written, which
allows it to remove virtual register COPY LiveIntervals that become dead
through the forwarding of all of their uses.

Reviewers: qcolombet, javed.absar, MatzeB, jonpa

Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny

Differential Revision: https://reviews.llvm.org/D30751

llvm-svn: 311038
2017-08-16 20:50:01 +00:00
Balaram Makam c5698befb6 Revert "MachineInstr: Reason locally about some memory objects before going to AA."
r310825 caused the clang-ppc64le-linux-lnt bot to go red
(http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/5712)
because of a test-suite failure of
SingleSource/UnitTests/2003-07-09-SignedArgs

This reverts commit 0028f6a87224fb595a1c19c544cde9b003035996.

llvm-svn: 311008
2017-08-16 14:17:43 +00:00
Quentin Colombet 647b482c1e [VirtRegRewriter] Properly model the register liveness on undef subreg definition
Undef subreg definition means that the content of the super register
doesn't matter at this point. While that's true for virtual registers,
this may not hold when replacing them with actual physical registers.
Indeed, some part of the physical register may be coalesced with the
related virtual register and thus, the values for those parts matter and
must be live.

The fix consists in checking whether or not subregs of the physical register
being assigned to an undef subreg definition are live through that def and
insert an implicit use if they are. Doing so, will keep them alive until
that point like they should be.

E.g., let vreg14 being assigned to R0_R1 then
%vreg14:gsub_0<def,read-undef> = COPY %R0 ; <-- R1 is still live here
%vreg14:gsub_1<def> = COPY %R1

Before this changes, the rewriter would change the code into:
%R0<def> = KILL %R0, %R0_R1<imp-def> ; <-- this tells R1 is redefined
%R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use> ; this value of this R1
                                                      ; is believed to come
                                                      ; from the previous
                                                      ; instruction

Because of this invalid liveness, later pass could make wrong choices and in
particular clobber live register as it happened with the register scavenger in
llvm.org/PR34107

Now we would generate:
%R0<def> = KILL %R0, %R0_R1<imp-def>, %R0_R1<imp-use> ; This tells R1 needs to
                                                      ; reach this point
%R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use>

The bug has been here forever, it got exposed recently because the register
scavenger got smarter.

Fixes llvm.org/PR34107

llvm-svn: 310979
2017-08-16 00:17:05 +00:00
Jessica Paquette 95c1107f4c [MachineOutliner] Only outline candidates of length >= 2
Since we don't factor in instruction lengths into outlining calculations
right now, it's never the case that a candidate could have length < 2.

Thus, we should quit early when we see such candidates.

llvm-svn: 310894
2017-08-14 22:57:41 +00:00
Andrew Kaylor 53a5fbb45f Add strictfp attribute to prevent unwanted optimizations of libm calls
Differential Revision: https://reviews.llvm.org/D34163

llvm-svn: 310885
2017-08-14 21:15:13 +00:00
Matt Arsenault 81da0d45f8 IPRA: Allow target to enable IPRA by default
llvm-svn: 310876
2017-08-14 19:54:47 +00:00
Matt Arsenault f9273c81d6 IPRA: Run RegUsageInfoPropagate much later
This was running immediately after isel, before
isel pseudos were even expanded which is really
unreasonable. Move this to before pre-reglloc
passes in case some other pre-regalloc pass wants to
use the updated regmask info.

Fixes one of the reasons IPRA doesn't do anything on
AMDGPU currently. Tests will be included with future
patch after a few more are fixed.

llvm-svn: 310875
2017-08-14 19:54:45 +00:00
Amaury Sechet 9c529b6be3 [DAGCombine] Do not try to deduplicate commutative operations if both operand are the same.
Summary: It is creating useless work as the commuted nodes is the same as the node we are working on in that case.

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33840

llvm-svn: 310832
2017-08-14 11:44:03 +00:00
Elad Cohen 6a9edda356 [SelectionDAG] combine vextract (v1iX extract_subvector(vNiX, Idx))
into vextract(vNiX,Idx) when creating vextract with getNode().
This case appeared in AVX512 after fixing pr33349 in r310552.

Differential revision: https://reviews.llvm.org/D36571

llvm-svn: 310828
2017-08-14 10:49:45 +00:00
Balaram Makam d9f53414de MachineInstr: Reason locally about some memory objects before going to AA.
This addresses a FIXME in MachineInstr::mayAlias.

llvm-svn: 310825
2017-08-14 09:41:40 +00:00
Elad Cohen 3a90a0c10d Revert "[DAGCombiner] Extending pattern detection for vector shuffle (REAPPLIED)"
This reverts commit r310782.

llvm-svn: 310822
2017-08-14 09:06:00 +00:00
Craig Topper 2251ef95a3 [X86][ARM][TargetLowering] Add SrcVT to isExtractSubvectorCheap
Summary:
Without the SrcVT its hard to know what is really being asked for. For example if your target has 128, 256, and 512 bit vectors. Maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not.

For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors.

Reviewers: RKSimon, zvi, efriedma

Reviewed By: RKSimon

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D36649

llvm-svn: 310793
2017-08-13 17:29:07 +00:00
Simon Pilgrim 5a86f0e717 [DAGCombiner] Extending pattern detection for vector shuffle (REAPPLIED)
If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index.

Reapplied with fix to only work with simple value types.

Committed on behalf of @jbhateja (Jatin Bhateja)

Differential Revision: https://reviews.llvm.org/D35788

llvm-svn: 310782
2017-08-12 17:43:25 +00:00
Sanjay Patel 169dae70a6 [x86] use more shift or LEA for select-of-constants (2nd try)
The previous rev (r310208) failed to account for overflow when subtracting the
constants to see if they're suitable for shift/lea. This version add a check
for that and more test were added in r310490.

We can convert any select-of-constants to math ops:
http://rise4fun.com/Alive/d7d

For this patch, I'm enhancing an existing x86 transform that uses fake multiplies
(they always become shl/lea) to avoid cmov or branching. The current code misses
cases where we have a negative constant and a positive constant, so this is just
trying to plug that hole.

The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start
with a select in IR, create a select DAG node, convert it into a sext, convert it
back into a select, and then lower it to sext machine code.

Some notes about the test diffs:

1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR.
2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. We could avoid the 
   push/pop in some cases if we used 'movzbl %al' instead of an xor on a different reg? That's a 
   post-DAG problem though.
3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if
   that's a regression, but those would always be nearly equivalent.
4. pr22338.ll and sext-i1.ll - These tests have undef operands, so we don't actually care about these diffs.
5. sbb.ll - This shows a win for what is likely a common case: choose -1 or 0.
6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again.
7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops.

Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass.

Differential Revision: https://reviews.llvm.org/D35340

llvm-svn: 310717
2017-08-11 15:44:14 +00:00
Nirav Dave 0a48e5d506 Improve handling of insert_subvector of bitcast values
Fix insert_subvector / extract_subvector merges of bitcast values.

Reviewers: efriedma, craig.topper, RKSimon

Subscribers: RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D34571

llvm-svn: 310711
2017-08-11 13:21:41 +00:00
Nirav Dave d1b3f09faa [X86][DAG] Switch X86 Target to post-legalized store merge
Move store merge to happen after intrinsic lowering to allow lowered
stores to be merged.

Some regressions due in MergeConsecutiveStores to missing
insert_subvector that are addressed in follow up patch.

Reviewers: craig.topper, efriedma, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34559

llvm-svn: 310710
2017-08-11 13:21:35 +00:00
Simon Pilgrim 83cf3a29b5 [DAGCombiner] Remove shuffle support from simplifyShuffleMask
rL310372 enabled simplifyShuffleMask to support undef shuffle mask inputs, but its causing hangs.

Removing support until I can triage the problem

llvm-svn: 310699
2017-08-11 08:37:00 +00:00
Mikael Holmen 8b10680922 [IfConversion] Maintain the CFG when predicating/merging blocks in IfConvert*
Summary:
This fixes PR32721 in IfConvertTriangle and possible similar problems in
IfConvertSimple, IfConvertDiamond and IfConvertForkedDiamond.

In PR32721 we had a triangle

   EBB
   | \
   |  |
   | TBB
   |  /
   FBB

where FBB didn't have any successors at all since it ended with an
unconditional return. Then TBB and FBB were be merged into EBB, but EBB
would still keep its successors, and the use of analyzeBranch and
CorrectExtraCFGEdges wouldn't help to remove them since the return
instruction is not analyzable (at least not on ARM).

The edge updating code and branch probability updating code is now pushed
into MergeBlocks() which allows us to share the same update logic between
more callsites. This lets us remove several dependencies on analyzeBranch
and completely eliminate RemoveExtraEdges.

One thing that showed up with this patch was that IfConversion sometimes
left a successor with 0% probability even if there was no branch or
fallthrough to the successor.

One such example from the test case ifcvt_bad_zero_prob_succ.mir. The
indirect branch tBRIND can only jump to bb.1, but without the patch we
got:

  bb.0:
    successors: %bb.1(0x80000000)

  bb.1:
    successors: %bb.1(0x80000000), %bb.2(0x00000000)
    tBRIND %r1, 1, %cpsr
    B %bb.1

  bb.2:

There is no way to jump from bb.1 to bb2, but still there is a 0% edge
from bb.1 to bb.2.

With the patch applied we instead get the expected:

  bb.0:
    successors: %bb.1(0x80000000)

  bb.1:
    successors: %bb.1(0x80000000)
    tBRIND %r1, 1, %cpsr
    B %bb.1

Since bb.2 had no predecessor at all, it was removed.

Several testcases had to be updated due to this since the removed
successor made the "Branch Probability Basic Block Placement" pass
sometimes place blocks in a different order.

Finally added a couple of new test cases:

* PR32721_ifcvt_triangle_unanalyzable.mir:
  Regression test for the original problem dexcribed in PR 32721.

* ifcvt_triangleWoCvtToNextEdge.mir:
  Regression test for problem that caused a revert of my first attempt
  to solve PR 32721.

* ifcvt_simple_bad_zero_prob_succ.mir:
  Test case showing the problem where a wrong successor with 0% probability
  was previously left.

* ifcvt_[diamond|forked_diamond|simple]_unanalyzable.mir
  Very simple test cases for the simple and (forked) diamond cases
  involving unanalyzable branches that can be nice to have as a base if
  wanting to write more complicated tests.

Reviewers: iteratee, MatzeB, grosser, kparzysz

Reviewed By: kparzysz

Subscribers: kbarton, davide, aemerson, nemanjai, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34099

llvm-svn: 310697
2017-08-11 06:57:08 +00:00
Nirav Dave f8556ad48f Revert "[DAG] Cleanup unused nodes after store merge. NFCI."
This reverts commit r310648 which causes an unexpected assertion failure

llvm-svn: 310659
2017-08-10 21:03:36 +00:00
Nirav Dave 4d28c0ff4f [DAG] Relax type restriction for store merge
Summary: Allow stores of bitcastable types to be merged by peeking through BITCAST nodes and recasting stored values constant and vector extract nodes as necessary.

Reviewers: jyknight, hfinkel, efriedma, RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34569

llvm-svn: 310655
2017-08-10 19:52:45 +00:00
Nirav Dave 99d9d24553 [DAG] Cleanup unused nodes after store merge. NFCI.
llvm-svn: 310648
2017-08-10 18:53:14 +00:00
Taewook Oh f5040b9685 Make .file directive to have basename only
Summary:
Currently LLVM puts directory along with the filename in .file directive, but this behavior doesn't match gcc. There's a no clear description about which one is right (https://sourceware.org/binutils/docs/as/File.html#File), but one document (https://sourceware.org/gdb/current/onlinedocs/stabs/ELF-Linker-Relocation.html) suggests that STT_FILE symbol in elf file is expected to have basename only, which should have a same sting file .file directive according to (https://docs.oracle.com/cd/E26502_01/html/E28388/eoiyg.html).

This also affects badly on the build system that uses hashing, as the directory info could be differnt from developer to developer even when they're working on same file.

Reviewers: pcc, mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36018

llvm-svn: 310642
2017-08-10 18:17:11 +00:00
Krzysztof Parzyszek bea30c6286 Add "Restored" flag to CalleeSavedInfo
The liveness-tracking code assumes that the registers that were saved
in the function's prolog are live outside of the function. Specifically,
that registers that were saved are also live-on-exit from the function.
This isn't always the case as illustrated by the LR register on ARM.

Differential Revision: https://reviews.llvm.org/D36160

llvm-svn: 310619
2017-08-10 16:17:32 +00:00
Nirav Dave 06242a99ce [DAG] Rewrite expression. NFC.
llvm-svn: 310608
2017-08-10 15:29:33 +00:00
Nirav Dave 926e2d39bf [X86] Keep dependencies when constructing loads in combineStore
Summary:
Preserve chain dependecies between old and new loads constructed to
prevent loads from reordering below later stores.

Fixes PR34088.

Reviewers: craig.topper, spatel, RKSimon, efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36528

llvm-svn: 310604
2017-08-10 15:12:32 +00:00
Guy Blank 136b543745 [SelectionDAG] Allow constant folding for implicitly truncating BUILD_VECTOR nodes.
In FoldConstantArithmetic, handle BUILD_VECTOR nodes that do implicit truncation on the elements.

This is similar to what is done in FoldConstantVectorArithmetic.

Differential Revision:
https://reviews.llvm.org/D36506

llvm-svn: 310593
2017-08-10 14:09:50 +00:00
Elad Cohen 22ba97a0a6 [SelectionDAG] When scalarizing vselect, don't assert on
a legal cond operand.

When scalarizing the result of a vselect, the legalizer currently expects
to already have scalarized the operands. While this is true for the true/false
operands (which have the same type as the result), it is not case for the
condition operand. On X86 AVX512, v1i1 is legal - this leads to operations such
as '< N x type> vselect < N x i1> < N x type> < N x type>' where < N x type > is
illegal to hit an assertion during the scalarization.

The handling is similar to r205625.
This also exposes the fact that (v1i1 extract_subvector) should be legal
and selectable on AVX512 - We do this by custom lowering to vector_extract_elt.
This still leaves us in some cases with redundant dag nodes which will be
combined in a separate soon to come patch.

This fixes pr33349.

Differential revision: https://reviews.llvm.org/D36511

llvm-svn: 310552
2017-08-10 07:44:23 +00:00
David Blaikie 76fb649b0d Reduce variable scope by moving declaration into if clause
llvm-svn: 310506
2017-08-09 18:34:18 +00:00
Nirav Dave 6110d3ad00 [DAG] Explicitly cleanup merged load values during store merge. NFCI.
llvm-svn: 310474
2017-08-09 13:37:07 +00:00
Serguei Katkov 6ea2e81cf6 [ImplicitNullCheck] Fix the bug when dependent instruction accesses memory
It is possible that dependent instruction may access memory.
In this case we must reject optimization because the memory change will
be visible in null handler basic block. So we will execute an instruction which
we must not execute if check fails.

Reviewers: sanjoy, reames
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36392

llvm-svn: 310443
2017-08-09 05:17:02 +00:00
Reid Kleckner e2e82061f9 [codeview] Emit nested enums and typedefs from classes
Previously we limited ourselves to only emitting nested classes, but we
need other kinds of types as well.

This fixes the Visual Studio STL visualizers, so that users can
visualize std::string and other objects.

llvm-svn: 310410
2017-08-08 20:30:14 +00:00
Nirav Dave 8a813cf646 [DAG] Introduce peekThroughBitcast function. NFCI.
llvm-svn: 310405
2017-08-08 20:01:18 +00:00
Nirav Dave 515116d7c2 [DAG] Update comments. NFC.
llvm-svn: 310404
2017-08-08 19:52:19 +00:00
Simon Pilgrim 91b7b991d4 [DAGCombiner] simplifyShuffleMask - handle UNDEF inputs from shuffles as well as BUILD_VECTOR
Minor extension to D36393

llvm-svn: 310372
2017-08-08 16:10:33 +00:00
Simon Pilgrim ef44228acb [DAGCombiner] Simplify shuffle mask index if the referenced input element is UNDEF
Fixes one of the cases in PR34041.

Differential Revision: https://reviews.llvm.org/D36393

llvm-svn: 310344
2017-08-08 11:03:30 +00:00
Sanjay Patel 807f92b8ff [x86] revert r310208 to investigate test-suite failures (PR34105 / PR34097)
llvm-svn: 310264
2017-08-07 15:47:48 +00:00
Nirav Dave 3d3bde7682 [DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector.
Relanding after case to insert explicit truncation as necessary.

Allow SCALAR_TO_VECTOR of EXTRACT_VECTOR_ELT to reduce to
EXTRACT_SUBVECTOR of vector shuffle when output is smaller. Marginally
improves vector shuffle computations.

Reviewers: efriedma, RKSimon, spatel

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35566

llvm-svn: 310256
2017-08-07 14:07:49 +00:00
Guy Blank 5ca01695f7 [SelectionDAG] reset NewNodesMustHaveLegalTypes flag between basic blocks
The NewNodesMustHaveLegalTypes flag is set to false at the beginning of CodeGenAndEmitDAG, and set to true after legalizing types.
But before calling CodeGenAndEmitDAG we build the DAG for the basic block.
So for the first basic block NewNodesMustHaveLegalTypes would be 'false' during the SDAG building, and for all other basic blocks it would be 'true'.

This patch sets the flag to false before SDAG building each basic block.

Differential Revision:
https://reviews.llvm.org/D33435

llvm-svn: 310239
2017-08-07 05:51:14 +00:00
Sanjay Patel a923c2ee95 [x86] use more shift or LEA for select-of-constants
We can convert any select-of-constants to math ops:
http://rise4fun.com/Alive/d7d

For this patch, I'm enhancing an existing x86 transform that uses fake multiplies 
(they always become shl/lea) to avoid cmov or branching. The current code misses 
cases where we have a negative constant and a positive constant, so this is just 
trying to plug that hole.

The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start 
with a select in IR, create a select DAG node, convert it into a sext, convert it 
back into a select, and then lower it to sext machine code.

Some notes about the test diffs:

1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR.
2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. I 
   think we could avoid the push/pop in some cases if we used 'movzbl %al' instead of an xor on 
   a different reg? That's a post-DAG problem though.
3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if 
   that's a regression, but I think those would always be nearly equivalent.
4. pr22338.ll and sext-i1.ll - These tests have undef operands, so I don't think we actually care about these diffs.
5. sbb.ll - This shows a win for what I think is a common case: choose -1 or 0.
6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again.
7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops.

Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass.

Differential Revision: https://reviews.llvm.org/D35340

llvm-svn: 310208
2017-08-06 16:27:07 +00:00
Matt Arsenault b94972cb82 IPRA: Don't crash on null getCallPreservedMask
Kernels aren't callable, so they don't have a call preserved mask.

llvm-svn: 310172
2017-08-05 07:50:18 +00:00
Kyle Butt 74f61dd8ef BlockPlacement: add a flag to force cold block outlining w/o a profile.
NFC.

llvm-svn: 310129
2017-08-04 21:13:41 +00:00
Nico Weber b24df62bb6 Revert r310058, it caused PR34073.
llvm-svn: 310118
2017-08-04 20:24:13 +00:00
Quentin Colombet c7652e22d7 [GlobalISel] Remove a stall comment in CMake.
Thanks to Diana Picus <diana.picus@linaro.org> for noticing.

NFC

llvm-svn: 310114
2017-08-04 20:15:41 +00:00
Marcello Maggioni 8de4bbdaa5 [MachineOperand] Add ChangeToTargetIndex method. NFC
Differential Revision: https://reviews.llvm.org/D36301

llvm-svn: 310083
2017-08-04 18:24:09 +00:00
Simon Pilgrim 5c63586489 [DAGCombiner] Extending pattern detection for vector shuffle.
If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index.

Committed on behalf of @jbhateja (Jatin Bhateja)

Differential Revision: https://reviews.llvm.org/D35788

llvm-svn: 310058
2017-08-04 12:46:35 +00:00
Eric Christopher 52854dcd34 Fix typo.
llvm-svn: 309997
2017-08-03 22:41:12 +00:00
Matt Arsenault a52391f2db DAG: Provide access to Pass instance from SelectionDAG
This allows accessing an analysis pass during lowering.

llvm-svn: 309991
2017-08-03 21:54:00 +00:00
Quentin Colombet 250e050a50 [GlobalISel] Make GlobalISel a non-optional library.
With this change, the GlobalISel library gets always built. In
particular, this is not possible to opt GlobalISel out of the build
using the LLVM_BUILD_GLOBAL_ISEL variable any more.

llvm-svn: 309990
2017-08-03 21:52:25 +00:00
Nirav Dave 3fc1c2365c [DAG] Allow merging of stores of vector loads
Remove restriction disallowing merging of stores vector loads into
larger store of larger vector load.

Reviewers: RKSimon, efriedma, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36158

llvm-svn: 309951
2017-08-03 15:51:20 +00:00
Robert Lougher 10f740df4d [LiveDebugVariables] Use lexical scope to trim debug value live intervals
The debug value live intervals computed by Live Debug Variables may extend
beyond the range of the debug location's lexical scope. In this case,
splitting of an interval can result in an interval outside of the scope being
created, causing extra unnecessary DBG_VALUEs to be emitted. To prevent this,
trim the intervals to the lexical scope.

This resolves PR33730.

Reviewers: aprantl

Differential Revision: https://reviews.llvm.org/D35953

llvm-svn: 309933
2017-08-03 11:54:02 +00:00
Simon Dardis 51296593a8 [SelectionDAG] Resolve PR33978.
rL306209 taught SelectionDAG how to add the dereferenceable flag when
expanding memcpy and memmove. The fix however contained a nit where
the offset + size was constructed as an APInt of PointerSize rather
than PointerSizeInBits.

This lead to isDereferenceableAndAlignedPointer() get truncated values or
values which would be sign extended within that function leading to
incorrect results.

Thanks to Alex Crichton for reporting the issue!

This resolves PR33978.

Reviewers: inouehrs

Differential Revision: https://reviews.llvm.org/D36236

llvm-svn: 309930
2017-08-03 09:38:46 +00:00
Sameer AbuAsal c8befe687f [RegisterCoalescer] Add wrapper for Erasing Instructions
Summary:
      To delete an instruction the coalescer needs to call eraseFromParent()
      on the MachineInstr, insert it in the ErasedInstrs list and update the
      Live Ranges structure. This patch re-factors the code to do all that in
      one function. This will also fix cases where previous code wasn't
      inserting deleted instructions in the ErasedList.

Reviewers: qcolombet, kparzysz

Reviewed By: qcolombet

Subscribers: MatzeB, llvm-commits, qcolombet

Differential Revision: https://reviews.llvm.org/D36204

llvm-svn: 309915
2017-08-03 02:41:17 +00:00
Rafael Espindola 79e238afee Delete Default and JITDefault code models
IMHO it is an antipattern to have a enum value that is Default.

At any given piece of code it is not clear if we have to handle
Default or if has already been mapped to a concrete value. In this
case in particular, only the target can do the mapping and it is nice
to make sure it is always done.

This deletes the two default enum values of CodeModel and uses an
explicit Optional<CodeModel> when it is possible that it is
unspecified.

llvm-svn: 309911
2017-08-03 02:16:21 +00:00
Hiroshi Inoue 0bd906ec8f [StackColoring] Update AliasAnalysis information in stack coloring pass (part 2)
This patch is update after the first patch (https://reviews.llvm.org/rL309651) based on the post-commit comments.

Stack coloring pass need to maintain AliasAnalysis information when merging stack slots of different types.
Actually, there is a FIXME comment in StackColoring.cpp

// FIXME: In order to enable the use of TBAA when using AA in CodeGen,
// we'll also need to update the TBAA nodes in MMOs with values
// derived from the merged allocas.

But, TBAA has been already enabled in CodeGen without fixing this pass.
The incorrect TBAA metadata results in recent failures in bootstrap test on ppc64le (PR33928) by allowing unsafe instruction scheduling.
Although we observed the problem on ppc64le, this is a platform neutral issue.

This patch makes the stack coloring pass maintains AliasAnalysis information when merging multiple stack slots.

This patch fixes PR33928.

llvm-svn: 309849
2017-08-02 18:16:32 +00:00
Adrian Prantl bd6d291c59 Assert that the offset of a DBG_VALUE is always 0. (NFC)
llvm-svn: 309834
2017-08-02 17:19:13 +00:00
Adrian Prantl 5577363a2c Remove the unused Offset field from MachineLocation (NFC)
rdar://problem/33580047

llvm-svn: 309831
2017-08-02 17:07:38 +00:00
Nirav Dave e44032f7e7 [DAG] Improve candidate pruning in store merge failure case. NFCI
During store merge we construct a sorted list of consecutive store
candidates and consider subsequences for merging into a single
store. For each subsequence we check if the stored value type is legal
the merged store would have valid and fast and if the constructed
value to be stored is valid. The only properties that affect this
check between subsequences is the size of the subsequence, the
alignment of the first store, the alignment of the stored load value
(when merging stores-of-loads), and whether the merged value is a
constant zero.

If we do not find a viable mergeable subsequence starting from the
first store of length N, we know that a subsequence starting at a
later store of length N will also fail unless the new store's
alignment, the new load's alignment (if we're merging store-of-loads),
or we've dropped stores of nonzero value and could construct a merged
stores of zero (for merging constants).

As a result if we fail to find a valid subsequence starting from the
first store we can safely skip considering subsequences that start
with subsequent stores unless one of the above properties is
true. This significantly (2x) improves compile time in some
pathological cases.

Reviewers: RKSimon, efriedma, zvi, spatel, waltl

Subscribers: grandinj, llvm-commits

Differential Revision: https://reviews.llvm.org/D35901

llvm-svn: 309830
2017-08-02 16:35:58 +00:00
Adrian Prantl 61b39b2aec Remove unused includes of MachineLocation.h (NFC)
llvm-svn: 309824
2017-08-02 15:32:18 +00:00
Adrian Prantl 2049c0d734 Remove unreachable code. (NFC)
MachineLocation::getOffset() always returns 0.

rdar://problem/33580047

llvm-svn: 309823
2017-08-02 15:22:17 +00:00
Diana Picus d5a00b0ff6 [MIR] Print target-specific constant pools
This should enable us to test the generation of target-specific constant
pools, e.g. for ARM:

constants:
 - id:              0
   value:           'g(GOT_PREL)-(LPC0+8-.)'
   alignment:       4
   isTargetSpecific: true

I intend to use this to test PIC support in GlobalISel for ARM.

This is difficult to test outside of that context, since the existing
MIR tests usually rely on parser support as well, and that seems a bit
trickier to add. We could try to add a unit test, but the setup for that
seems rather convoluted and overkill.

We do test however that the parser reports a nice error when
encountering a target-specific constant pool.

Differential Revision: https://reviews.llvm.org/D36092

llvm-svn: 309806
2017-08-02 11:09:30 +00:00
Nirav Dave 15894f8dec [DAG] Refactor store merge subexpressions. NFC.
Distribute various expressions across ifs.

llvm-svn: 309777
2017-08-02 01:08:38 +00:00
Matt Arsenault acc5e82b0e DAG: Undo and->or combine with FrameIndexes
This pattern shows up when lowering byval copies on AMDGPU.

The byval object access is split into 4-byte chunks, adding a
constant offset to the FixedStack base. When some of the offsets
turn into ors, this prevents combining the constant offsets.

This makes it not apparent that the object is there when matching
addressing modes, so it ends up using a scratch wave offset
relative access and the lengthy frame index expansion for that.

llvm-svn: 309775
2017-08-02 00:43:42 +00:00
Adrian Prantl 83ca4fc7bc Update LiveDebugValues to generate DIExpressions for spill offsets
instead of using the deprecated offset field of DBG_VALUE.

This has no observable effect on the generated DWARF, but the
assembler comments will look different.

rdar://problem/33580047

llvm-svn: 309773
2017-08-02 00:16:56 +00:00
Adrian Prantl aac78ce47e Use helper function instead of manually constructing DBG_VALUEs (NFC)
rdar://problem/33580047

llvm-svn: 309757
2017-08-01 22:37:35 +00:00
Adrian Prantl 032d2381bf Remove PrologEpilogInserter's usage of DBG_VALUE's offset field
In the last half-dozen commits to LLVM I removed code that became dead
after removing the offset parameter from llvm.dbg.value gradually
proceeding from IR towards the backend. Before I can move on to
DwarfDebug and friends there is one last side-called offset I need to
remove:  This patch modifies PrologEpilogInserter's use of the
DBG_VALUE's offset argument to use a DIExpression instead. Because the
PrologEpilogInserter runs at the Machine level I had to play a little
trick with a named llvm.dbg.mir node to get the DIExpressions to print
in MIR dumps (which print the llvm::Module followed by the
MachineFunction dump).

I also had to add rudimentary DwarfExpression support to CodeView and
as a side-effect also fixed a bug (CodeViewDebug::collectVariableInfo
was supposed to give up on variables with complex DIExpressions, but
would fail to do so for fragments, which are also modeled as
DIExpressions).

With this last holdover removed we will have only one canonical way of
representing offsets to debug locations which will simplify the code
in DwarfDebug (and future versions of CodeViewDebug once it starts
handling more complex expressions) and make it easier to reason about.

This patch is NFC-ish: All test case changes are for assembler
comments and the binary output does not change.

rdar://problem/33580047
Differential Revision: https://reviews.llvm.org/D36125

llvm-svn: 309751
2017-08-01 21:45:24 +00:00
Nirav Dave dcc5afaad9 [DAG] Factor out common expressions. NFC.
llvm-svn: 309740
2017-08-01 20:30:52 +00:00
Reid Kleckner 29c675247d [DebugInfo] Don't turn dbg.declare into DBG_VALUE for static allocas
Summary:
We already have information about static alloca stack locations in our
side table. Emitting instructions for them is inefficient, and it only
happens when the address of the alloca has been materialized within the
current block, which isn't often.

Reviewers: aprantl, probinson, dblaikie

Subscribers: jfb, dschuff, sbc100, jgravelle-google, hiraditya, llvm-commits, aheejin

Differential Revision: https://reviews.llvm.org/D36117

llvm-svn: 309729
2017-08-01 19:45:09 +00:00
Nirav Dave 35dd1ac29c Pull out VectorNumElements value. NFC.
llvm-svn: 309719
2017-08-01 18:19:56 +00:00
Nirav Dave 27a6605bdc Revert "[DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector."
This reverts commit r309680 which appears to be raising an assertion
in the test-suite.

llvm-svn: 309717
2017-08-01 18:09:25 +00:00
Sanjay Patel 4dbdd470a8 [CGP] use narrower types in memcmp expansion when possible
This only affects very small memcmp on x86 for now, but it
will become more important if we allow vector-sized load and
compares.

llvm-svn: 309711
2017-08-01 17:24:54 +00:00
Nirav Dave f54c8370e5 [DAG] Convert extload check to equivalent type check. NFC.
Replace check with check that consuming store has the same type.

llvm-svn: 309708
2017-08-01 17:19:41 +00:00
Nirav Dave b5a0af6b6e [DAG] Move extload check in store merge. NFC.
Move candidate check from later check to initial candidate check.

llvm-svn: 309698
2017-08-01 16:00:47 +00:00
Manoj Gupta 7ebbe655ed [X86] Fix a crash in FEntryInserter Pass.
Summary:
FEntryInserter pass unconditionally derefs the first Instruction
in the first Basic Block. The pass crashes when the first
BasicBlock is empty. Fix the crash by not dereferencing the basic
Block iterator. This fixes an issue observed when building Linux kernel
4.4 with clang.

Fixes PR33971.

Reviewers: hfinkel, niravd, dblaikie

Reviewed By: niravd

Subscribers: davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D35979

llvm-svn: 309694
2017-08-01 15:39:12 +00:00
David Blaikie c2be863681 DebugInfo: Update flag description that'd been copypasted from another
Post-commit review feedback from Paul Robinson on r309630. Thanks Paul!

llvm-svn: 309685
2017-08-01 14:50:50 +00:00
Nirav Dave b5cb48c6ae [DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector.
Summary:
Allow SCALAR_TO_VECTOR of EXTRACT_VECTOR_ELT to reduce to
EXTRACT_SUBVECTOR of vector shuffle when output is smaller. Marginally
improves vector shuffle computations.

Reviewers: efriedma, RKSimon, spatel

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35566

llvm-svn: 309680
2017-08-01 13:45:35 +00:00
Andrew V. Tischenko d56595184b Support itineraries in TargetSubtargetInfo::getSchedInfoStr - Now if the given instr does not have sched model then we try to calculate the latecy/throughput with help of itineraries.
Differential Revision https://reviews.llvm.org/D35997

llvm-svn: 309666
2017-08-01 09:15:43 +00:00
Hiroshi Inoue b9417dbd48 [StackColoring] Update AliasAnalysis information in stack coloring pass
Stack coloring pass need to maintain AliasAnalysis information when merging stack slots of different types.
Actually, there is a FIXME comment in StackColoring.cpp

// FIXME: In order to enable the use of TBAA when using AA in CodeGen,
// we'll also need to update the TBAA nodes in MMOs with values
// derived from the merged allocas.

But, TBAA has been already enabled in CodeGen without fixing this pass.
The incorrect TBAA metadata results in recent failures in bootstrap test on ppc64le (PR33928) by allowing unsafe instruction scheduling.
Although we observed the problem on ppc64le, this is a platform neutral issue.

This patch makes the stack coloring pass maintains AliasAnalysis information when merging multiple stack slots.

llvm-svn: 309651
2017-08-01 03:32:15 +00:00
Eli Friedman 37f41d1e7c [ScheduleDAG] Don't schedule node with physical register interference
https://reviews.llvm.org/D31536 didn't really solve the problem it was
trying to solve; it got rid of the assertion failure, but we were still
scheduling the DAG incorrectly (mixing together instructions from
different calls), leading to a MachineVerifier failure.

In order to schedule the DAG correctly, we have to make sure we don't
schedule a node which should be blocked by an interference. Fix
ScheduleDAGRRList::PickNodeToScheduleBottomUp so it doesn't pick a node
like that.

The added call to FindAvailableNode() is the key change here; this makes
sure we don't try to schedule a call while we're in the middle of
scheduling a different call. I'm not sure this is the right approach; in
particular, I'm not sure how to prove we don't end up with an infinite
loop of repeatedly backtracking.

This also reverts the code change from D31536. It doesn't do anything
useful: we should never schedule an ADJCALLSTACKDOWN unless we've
already scheduled the corresponding ADJCALLSTACKUP.

Differential Revision: https://reviews.llvm.org/D33818

llvm-svn: 309642
2017-08-01 00:28:40 +00:00
David Blaikie 038e28a5a7 DebugInfo: Put range base specifier entry functionality behind a flag
Chromium's gold build seems to have trouble with this (gold produces
errors) - not sure if it's gold that's not coping with the valid
representation, or a bug in the implementation in LLVM, etc.

llvm-svn: 309630
2017-07-31 21:48:42 +00:00
Reid Kleckner 2de471dca9 [codeview] Ignore DBG_VALUEs when choosing a BB start source loc
When the first instruction of a basic block has no location (consider a
LEA materializing the address of an alloca for a call), we want to start
the line table for the block with the first valid source location in the
block.  We need to ignore DBG_VALUE instructions during this scan to get
decent line tables.

llvm-svn: 309628
2017-07-31 21:03:08 +00:00
Quentin Colombet 15f6ffbf7c [TargetPassConfig] Feature generic options to setup start/stop-after/before
This patch refactors the code used in llc such that all the users of the
addPassesToEmitFile API have access to a homogeneous way of handling
start/stop-after/before options right out of the box.

In particular, just invoking addPassesToEmitFile will set the proper
pipeline without additional effort (modulo parsing a .mir file if the
start-before/after options are used.

NFC.

Differential Revision: https://reviews.llvm.org/D30913

llvm-svn: 309599
2017-07-31 18:24:07 +00:00
Sanjay Patel fea731a4aa [CGP] use subtract or subtract-of-cmps for result of memcmp expansion
As noted in the code comment, transforming this in the other direction might require 
a separate transform here in CGP given the block-at-a-time DAG constraint.

Besides that theoretical motivation, there are 2 practical motivations for the 
subtract-of-cmps form:

1. The codegen for both x86 and PPC is better for this IR (though PPC could be better still). 
   There is discussion about canonicalizing IR to the select form 
   ( http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html ), 
   so we probably need to add DAG transforms for those patterns anyway, but this improves the 
   memcmp output without waiting for that step.

2. If we allow vector-sized chunks for the load and compare, x86 is better prepared to convert
   that to optimal code when using subtract-of-cmps, so another prerequisite patch is avoided
   if we choose to enable that.

Differential Revision: https://reviews.llvm.org/D34904

llvm-svn: 309597
2017-07-31 18:08:24 +00:00
Aditya Nandakumar 02c602e18c [GISel]: Support Widening G_ICMP's destination operand.
Updated AArch64 to widen destination to s32.
https://reviews.llvm.org/D35737

Reviewed by Tim

llvm-svn: 309579
2017-07-31 17:00:16 +00:00
Florian Hahn 1b09a37c07 Extend ifndef to printDebugLoc.
GCC7 did not warn about that, but Clang does.

llvm-svn: 309573
2017-07-31 16:29:00 +00:00
Florian Hahn 94b8a87c8e Extend ifdefs to more unused helper functions.
This fixes a buildbot failure with -Werror introduced by r309553

llvm-svn: 309572
2017-07-31 16:11:43 +00:00
Simon Dardis 8930b4a049 [SelectionDAG][mips] Fix PR33883
PR33883 shows that calls to intrinsic functions should not have their vector
arguments or returns subject to ABI changes required by the target.

This resolves PR33883.

Thanks to Alex Crichton for reporting the issue!

Reviewers: zoran.jovanovic, atanasyan

Differential Revision: https://reviews.llvm.org/D35765

llvm-svn: 309561
2017-07-31 14:06:58 +00:00
Florian Hahn 6b3216aad8 Guard print() functions only used by dump() functions.
Summary:
Since  r293359, most dump() function are only defined when
`!defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)` holds. print() functions
only used by dump() functions are now unused in release builds,
generating lots of warnings. This patch only defines some print()
functions if they are used.

Reviewers: MatzeB

Reviewed By: MatzeB

Subscribers: arsenm, mzolotukhin, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D35949

llvm-svn: 309553
2017-07-31 10:07:49 +00:00
David Blaikie 4dd663752d DebugInfo: Fix r309526, ensure resetting base address selection entries are used
Missed the resetting base address selections when going from a base
address version to zero base address for non-base-addressed entries.

llvm-svn: 309529
2017-07-31 00:18:24 +00:00
David Blaikie 89c81a0b91 DebugInfo: Use base address selection entries in debug_ranges to reduce relocations
(from comments in the test)
Group ranges in a range list that apply to the same section and use a base
address selection entry to reduce the number of relocations to one reloc per
section per range list. DWARF5 debug_rnglist will be more efficient than this
in terms of relocations, but it's still better than one reloc per entry in a
range list.

This is an object/executable size tradeoff - shrinking objects, but growing
the linked executable. In one large binary tested, total object size (not just
debug info) shrank by 16%, entirely relocation entries. Linked executable
grew by 4%. This was with compressed debug info in the objects, uncompressed
in the linked executable. Without compression in the objects, the win would be
smaller (the growth of debug_ranges itself would be more significant).

llvm-svn: 309526
2017-07-30 22:10:00 +00:00
Simon Pilgrim 718cb0ea62 [SelectionDAG][X86] CombineBT - more aggressively determine demanded bits
This patch is in 2 parts:

1 - replace combineBT's use of SimplifyDemandedBits (hasOneUse only) with SelectionDAG::GetDemandedBits to more aggressively determine the lower bits used by BT.

2 - update SelectionDAG::GetDemandedBits to support ANY_EXTEND - if the demanded bits are only in the non-extended portion, then peek through and demand from the source value and then ANY_EXTEND that if we found a match.

Differential Revision: https://reviews.llvm.org/D35896

llvm-svn: 309486
2017-07-29 14:50:25 +00:00
Jessica Paquette d87f54493d [MachineOutliner] NFC: Change IsTailCall to a call class + frame class
This commit

- Removes IsTailCall and replaces it with a target-defined unsigned
- Refactors getOutliningCallOverhead and getOutliningFrameOverhead so that they don't use IsTailCall
- Adds a call class + frame class classification to OutlinedFunction and Candidate respectively

This accomplishes a couple things.

Firstly, we don't need the notion of *tail call* in the general outlining algorithm.

Secondly, we now can have different "outlining classes" for each candidate within a set of candidates.
This will make it easy to add new ways to outline sequences for certain targets and dynamically choose
an appropriate cost model for a sequence depending on the context that that sequence lives in.

Ultimately, this should get us closer to being able to do something like, say avoid saving the link
register when outlining AArch64 instructions.

llvm-svn: 309475
2017-07-29 02:55:46 +00:00
Adrian Prantl 359846fe1c Remove the unused offset field from LiveDebugValues (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309455
2017-07-28 23:25:51 +00:00
Adrian Prantl b2811e50f9 Remove the unused offset field from LiveDebugVariables (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309451
2017-07-28 23:06:50 +00:00
Adrian Prantl 8b9bb534a1 Remove the unused offset from DBG_VALUE (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309450
2017-07-28 23:00:45 +00:00
Adrian Prantl d92ac5a259 Remove the unused DBG_VALUE offset parameter from GlobalISel (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309449
2017-07-28 22:46:20 +00:00
Adrian Prantl 1b63dc54b0 Remove the unused DBG_VALUE offset parameter from RegAllocFast (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309446
2017-07-28 22:36:55 +00:00
Adrian Prantl a617576bb1 Remove the unused dbg.value offset from SelectionDAG (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309436
2017-07-28 21:27:35 +00:00
Adrian Prantl abe04759a6 Remove the obsolete offset parameter from @llvm.dbg.value
There is no situation where this rarely-used argument cannot be
substituted with a DIExpression and removing it allows us to simplify
the DWARF backend. Note that this patch does not yet remove any of
the newly dead code.

rdar://problem/33580047
Differential Revision: https://reviews.llvm.org/D35951

llvm-svn: 309426
2017-07-28 20:21:02 +00:00
Reid Kleckner 9be82c3169 Fix conditional tail call branch folding when both edges are the same
The conditional tail call logic did the wrong thing when both
destinations of a conditional branch were the same:

BB#1: derived from LLVM BB %entry
    Live Ins: %EFLAGS
    Predecessors according to CFG: BB#0
        JE_1 <BB#5>, %EFLAGS<imp-use,kill>
        JMP_1 <BB#5>

BB#5: derived from LLVM BB %sw.epilog
    Predecessors according to CFG: BB#1
        TCRETURNdi64 <ga:@mergeable_conditional_tailcall>, 0, ...

We would fold the JE_1 to a TCRETURNdi64cc, and then remove our BB#5
successor. Then BB#5 would be deleted as it had no predecessors, leaving
a dangling "JMP_1 <BB#5>" reference behind to cause assertions later.

This patch checks that both conditional branch destinations are
different before doing the transform. The standard branch folding logic
is able to remove both the JMP_1 and the JE_1, and for my test case we
end up forming a better conditional tail call later.

Fixes PR33980

llvm-svn: 309422
2017-07-28 19:48:40 +00:00
Jessica Paquette 4602c3437c [MachineOutliner] NFC: Comment tidying
The comment on describing the suffix tree had some pruning
stuff that was out of date in it.

Also fixed some typos.

llvm-svn: 309365
2017-07-28 05:59:30 +00:00
Jessica Paquette 809d708b8a [MachineOutliner] NFC: Split up getOutliningBenefit
This is some more cleanup in preparation for some actual
functional changes. This splits getOutliningBenefit into
two cost functions: getOutliningCallOverhead and
getOutliningFrameOverhead. These functions return the
number of instructions that would be required to call
a specific function and the number of instructions
that would be required to construct a frame for a
specific funtion. The actual outlining benefit logic
is moved into the outliner, which calls these functions.

The goal of refactoring getOutliningBenefit is to:

- Get us closer to getting rid of the IsTailCall flag

- Further split up "target-specific" things and
"general algorithm" things

llvm-svn: 309356
2017-07-28 03:21:58 +00:00
David Blaikie 89daf77a11 DebugInfo: Consider a CU containing only local imported entities to be 'empty'
This can come up in ThinLTO & wastes space & makes degenerate IR.

As per the added FIXME, ultimately, local imported entities should hang
off the function and that way the imported entity list on the CU can be
tested for emptiness like all the other CU lists.

(function-attached local imported entities are probably also the best
path forward for fixing how imported entities are handled both in
cross-module use (currently, while ThinLTO preserves the imported
entities, they would not get used at the imported inlined location -
only in the abstract origin that appears in the partial CU created by
the import (which isn't emitted under Fission due to cross-CU
limitations there)) and to reduce the number of points where imported
entities are emitted (they're currently emitted into every inlined
instance, concrete instance, and abstract origin - they should only go
in teh abstract origin if there is one, otherwise in the concrete
instance - but this requires lots of delayed handling and wiring up,
same as abstract variables & subprograms))

llvm-svn: 309354
2017-07-28 03:06:25 +00:00
Jessica Paquette 78681be2a4 [MachineOutliner] Cleanup: move findCandidates out of suffix tree
Doing some cleanup in preparation for some functional changes.
This commit moves findCandidates out of the suffix tree and into the
MachineOutliner class. This is much easier to follow, and removes
the burden of candidate choice from the suffix tree.

It also adds a couple FIXMEs and simplifies building outlined function
names.

llvm-svn: 309334
2017-07-27 23:24:43 +00:00
Simon Pilgrim ac84850ea6 [SelectionDAG] Improve DAGTypeLegalizer::convertMask assertion (PR33960)
Improve DAGTypeLegalizer::convertMask's isSETCCorConvertedSETCC assertion to properly check for any mixture of SETCC or BUILD_VECTOR of constants, or a logical mask op of them.

llvm-svn: 309302
2017-07-27 18:15:54 +00:00
Adam Nemet 6374331a8c [OptRemark] Allow streaming of 64-bit integers
llvm-svn: 309293
2017-07-27 16:54:13 +00:00
Simon Pilgrim 64a795c5f5 [SelectionDAG] Avoid repeated calls to getNumOperands in for loop. NFCI.
llvm-svn: 309283
2017-07-27 15:42:21 +00:00
Adrian Prantl 960e7663f3 remove redundant check
llvm-svn: 309280
2017-07-27 15:24:20 +00:00
Simon Pilgrim 0d543b5921 [SelectionDAG] Tidyup mask creation. NFCI.
Assign all concat elements to UNDEF and then just replace the first element, instead of copying everything individually.

llvm-svn: 309277
2017-07-27 15:08:53 +00:00
David Blaikie 2195e13676 DebugInfo: Ensure imported entities at the top level of an inlined function don't cause degenerate concrete definitions
Local imported entities at the top level of a subprogram were being
handled differently from those in nested scopes - that different
handling would cause pseudo concrete out-of-line definitions to be
created (but without any of their attributes, nor an abstract_origin) in
the case where there was no real concrete definition.

These local imported entities also only appeared in the concrete
definition where those imported entities in nested scopes appear in all
cases (abstract, concrete, and inlined). This change at least makes top
level case handle the same as the others - though there's a FIXME to
improve this to /only/ emit them into the abstract origin (though this
requires more plumbing - like the abstract subprogram and variable
handling that must defer population until the end of the unit to
discover if there is an abstract origin, or only a standalone concrete
definition).

llvm-svn: 309237
2017-07-27 00:06:53 +00:00