Commit Graph

136742 Commits

Author SHA1 Message Date
Sjoerd Meijer 58156715b4 MachineLoop: add methods findLoopControlBlock and findLoopPreheader
This adds two new utility functions findLoopControlBlock and findLoopPreheader
to MachineLoop and MachineLoopInfo. These functions are refactored and taken
from the Hexagon target as they are target independent; thus this is intendend to
be a non-functional change.

Differential Revision: https://reviews.llvm.org/D22959

llvm-svn: 278661
2016-08-15 08:22:42 +00:00
James Molloy 9a3c82f5cf [SimplifyCFG] Rewrite SinkThenElseCodeToEnd
The new version has several advantages:
  1) IMSHO it's more readable and neater
  2) It handles loads and stores properly
  3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch.

With this change we can now finally sink load-modify-store idioms such as:

    if (a)
      return *b += 3;
    else
      return *b += 4;

    =>

    %z = load i32, i32* %y
    %.sink = select i1 %a, i32 5, i32 7
    %b = add i32 %z, %.sink
    store i32 %b, i32* %y
    ret i32 %b

When this works for switches it'll be even more powerful.

llvm-svn: 278660
2016-08-15 08:04:56 +00:00
Prakhar Bahuguna a305a435a6 [Thumb] Validate branch target for CBZ/CBNZ instructions.
Summary:
The assembler currently does not check the branch target for CBZ/CBNZ
instructions, which only permit branching forwards with a positive offset. This
adds validation for the branch target to ensure negative PC-relative offsets are
not encoded into the instruction, whether specified as a literal or as an
assembler symbol.

Reviewers: rengolin, t.p.northover

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D23312

llvm-svn: 278659
2016-08-15 07:57:44 +00:00
James Molloy 196ad0823e [LSR] Don't try and create post-inc expressions on non-rotated loops
If a loop is not rotated (for example when optimizing for size), the latch is not the backedge. If we promote an expression to post-inc form, we not only increase register pressure and add a COPY for that IV expression but for all IVs!

Motivating testcase:

    void f(float *a, float *b, float *c, int n) {
      while (n-- > 0)
        *c++ = *a++ + *b++;
    }

It's imperative that the pointer increments be located in the latch block and not the header block; if not, we cannot use post-increment loads and stores and we have to keep both the post-inc and pre-inc values around until the end of the latch which bloats register usage.

llvm-svn: 278658
2016-08-15 07:53:03 +00:00
Craig Topper f774de6d54 [X86] PADDUSB/W instructions should be commutable.
llvm-svn: 278654
2016-08-15 06:31:57 +00:00
Craig Topper 80c8b80919 [X86] Mark some of the X86 SDNodes as commutative.
llvm-svn: 278653
2016-08-15 04:47:30 +00:00
Craig Topper dbc387cfc9 [X86] X86ISD::FANDN is not commutative or associative.
llvm-svn: 278652
2016-08-15 04:47:28 +00:00
David Majnemer 3b47a5a562 [ScopedNoAliasAA] collectMDInDomain should be a free function
collectMDInDomain doesn't use any class members, making it a free
function is not a functional change.

llvm-svn: 278651
2016-08-15 03:56:06 +00:00
David Majnemer 8b8869f8ef [ScopedNoAliasAA] Only collect noalias nodes if we have alias.scope nodes
No functional change is intended.

llvm-svn: 278646
2016-08-15 02:23:50 +00:00
David Majnemer ddc7ab26fc [ScopedNoAliasAA] Replace !ScopeNodes.size() with ScopeNodes.empty()
No functional change is intended.

llvm-svn: 278645
2016-08-15 02:23:48 +00:00
David Majnemer c77a1390de Revert "[ScopedNoAliasAA] Remove an unneccesary set"
This reverts commit r278641.  I'm not sure why but this has upset the
multistage builders...

llvm-svn: 278644
2016-08-15 02:23:46 +00:00
David Majnemer 5ec9c58f13 [ScopedNoAliasAA] Remove an unneccesary set
We are trying to prove that one group of operands is a subset of
another.  We did this by populating two Sets and determining that every
element within one was inside the other.

However, this is unnecessary.  We can simply construct a single set and
test if each operand is within it.

llvm-svn: 278641
2016-08-15 00:13:04 +00:00
Sanjay Patel 52fe9ae990 [InstCombine] add test for missing vector icmp fold
llvm-svn: 278639
2016-08-14 22:56:46 +00:00
Sanjay Patel 7e57b00274 [InstCombine] add tests for vector icmp folds
llvm-svn: 278637
2016-08-14 22:44:10 +00:00
Sanjay Patel 8554f70c07 [InstCombine] add test for potentially missing vector icmp fold
llvm-svn: 278636
2016-08-14 22:30:07 +00:00
Sanjay Patel beebe05af1 [InstCombine] add test for missing vector icmp fold
llvm-svn: 278635
2016-08-14 22:29:27 +00:00
Sanjay Patel ba1f9fbddc [InstCombine] add tests for missing vector icmp folds
llvm-svn: 278634
2016-08-14 22:28:50 +00:00
Sanjay Patel f6559404d5 [InstCombine] remove unnecessary function attributes from tests
llvm-svn: 278633
2016-08-14 21:48:21 +00:00
Sanjay Patel b44ca3bfa9 [InstCombine] add tests for missing vector icmp folds
llvm-svn: 278632
2016-08-14 21:36:22 +00:00
Sanjay Patel bbb3dffd0a [InstCombine] add test for missing vector icmp fold
llvm-svn: 278631
2016-08-14 21:05:08 +00:00
Sanjay Patel 66a3457a4c [InstCombine] add test for missing vector icmp fold
llvm-svn: 278630
2016-08-14 20:39:42 +00:00
Craig Topper 37e8c5443c [AVX-512] Mark VPMADDWD as commutable to match SSE/AVX version.
llvm-svn: 278629
2016-08-14 17:57:22 +00:00
Craig Topper c677e97dff [AVX-512] Add masked commutable floating point max/min instructions to folding tables.
llvm-svn: 278628
2016-08-14 17:57:19 +00:00
Craig Topper 29fbdc309a [AVX-512] Add masked logical operations to memory folding tables.
llvm-svn: 278627
2016-08-14 17:57:16 +00:00
Igor Breger 505f2cc468 [AVX512] Fix VFPCLASSSD/VFPCLASSSS intrinsic lowering. The i1 result should be zero extended according to SPEC.
Differential Revision: http://reviews.llvm.org/D23489

llvm-svn: 278626
2016-08-14 13:58:57 +00:00
Igor Breger 6fc00b0acf autogenerate checks
llvm-svn: 278624
2016-08-14 09:34:39 +00:00
Igor Breger 8672408db0 [AVX512] Fix insertelement i1 lowering.
1. Use shuffle to insert element i1 into vector. The previous implementation was incorrect ( dest_bit OR src_bit , it doesn't clear the bit if src_bit=0 )
2. Improve shuffle i1 vector, use CVT2MASK if supported instead TRUNCATE.

Differential Revision: http://reviews.llvm.org/D23347

llvm-svn: 278623
2016-08-14 05:25:07 +00:00
Saleem Abdulrasool 98541b09f4 Revert "gold: add a cast to appease std::max NFC"
This was fixed differently by Teresa and this should no longer be needed.

llvm-svn: 278622
2016-08-14 05:07:20 +00:00
Diana Picus 68be1eb885 Revert "CodeGen: If Convert blocks that would form a diamond when tail-merged."
This reverts commit r278287.

This commit broke the clang-cmake-thumbv7-a15-full-sh bot.
See https://llvm.org/bugs/show_bug.cgi?id=28949

llvm-svn: 278621
2016-08-14 02:10:18 +00:00
Diana Picus 35ccf53e75 Revert "Codegen: Don't tail-duplicate blocks with un-analyzable fallthrough."
This reverts commit r278288.

r278287 broke the clang-cmake-thumbv7-a15-full-sh bot.
Revert this so we can get to r278287.

llvm-svn: 278620
2016-08-14 02:10:12 +00:00
Sanjoy Das 35459f0e34 [IRCE] Change variable grouping; NFC
llvm-svn: 278619
2016-08-14 01:04:50 +00:00
Sanjoy Das 2143447c73 [IRCE] Create llvm::Loop instances for cloned out loops
llvm-svn: 278618
2016-08-14 01:04:46 +00:00
Sanjoy Das 7a18a238c6 [IRCE] Don't iterate on loops that were cloned out
IRCE has the ability to further version pre-loops and post-loops that it
created, but this isn't useful at all.  This change teaches IRCE to
leave behind some metadata in the loops it creates (by cloning the main
loop) so that these new loops are not re-processed by IRCE.

Today this bug is hidden by another bug -- IRCE does not update LoopInfo
properly so the loop pass manager does not re-invoke IRCE on the loops
it split out.  However, once the latter is fixed the bug addressed in
this change causes IRCE to infinite-loop in some cases (e.g. it splits
out a pre-loop, a pre-pre-loop from that, a pre-pre-pre-loop from that
and so on).

llvm-svn: 278617
2016-08-14 01:04:36 +00:00
Sanjoy Das 43fdc54303 [IRCE] Add better DEBUG diagnostic; NFC
NFC meaning IRCE should not _do_ anything different, but
-debug-only=irce will be a little friendlier.

llvm-svn: 278616
2016-08-14 01:04:31 +00:00
Mehdi Amini a71002e7f1 Fix bitcode auto-upgrade when using bitcode lazy loading
The auto-upgrade path could be called before the VST (global
names) was fully parsed, and thus intrinsic names were not
available and the autoupgrade logic could not operate.

Fix link failures with ThinLTO.

This is a recommit of r278610 with a different fix.

llvm-svn: 278615
2016-08-14 00:01:27 +00:00
Ron Lieberman 822ee88ab8 Fix unsupported relocation type R_HEX_6_X' for symbol .rodata
LowerTargetConstantPool is not properly setting the TargetFlag to indicate
desired relocation. Coding error, the offset parameter was omitted, so the
TargetFlag was used as the offset, and the TargetFlag defaulted to zero.

This only affects -fpic compilation, and only those items created in a
Constant Pool, for example a vector of constants. Halide ran into this issue.

llvm-svn: 278614
2016-08-13 23:41:11 +00:00
Mehdi Amini 466a64e298 Revert "Fix bitcode auto-upgrade when using bitcode lazy loading"
This reverts commit r278610. Tests are broken

llvm-svn: 278613
2016-08-13 23:39:14 +00:00
Sanjoy Das 1b1272f515 [IRCE] Fix test case; NFC
The (negative) test case is supposed to check that IRCE does not muck
with range checks it cannot handle, not that it does the right thing in
the absence of profiling information.

llvm-svn: 278612
2016-08-13 23:36:40 +00:00
Sanjoy Das 2a2f14d7ab [IRCE] Be resilient in the face of non-simplified loops
Loops containing `indirectbr` may not be in simplified form, even after
running LoopSimplify.  Reject then gracefully, instead of tripping an
assert.

llvm-svn: 278611
2016-08-13 23:36:35 +00:00
Mehdi Amini e62aaf2303 Fix bitcode auto-upgrade when using bitcode lazy loading
The auto-upgrade path could be called before the VST (global
names) was fully parsed, and thus intrinsic names were not
available and the autoupgrade logic could not operate.

Fix link failures with ThinLTO.

llvm-svn: 278610
2016-08-13 23:31:53 +00:00
Mehdi Amini 8c629ecf3a Revert "Revert "Invariant start/end intrinsics overloaded for address space""
This reverts commit 32fc6488e48eafc0ca1bac1bd9cbf0008224d530.

llvm-svn: 278609
2016-08-13 23:31:24 +00:00
Mehdi Amini 164ac651da Revert "Invariant start/end intrinsics overloaded for address space"
This reverts commit r276447.

llvm-svn: 278608
2016-08-13 23:27:32 +00:00
Sanjoy Das f2b7bafae4 [IRCE] Use dyn_cast instead of explicit isa/cast; NFC
llvm-svn: 278607
2016-08-13 22:00:12 +00:00
Sanjoy Das d1d62a1354 [IRCE] Use range-for; NFC
llvm-svn: 278606
2016-08-13 22:00:09 +00:00
Mehdi Amini fa0f96b083 [ADT] Add a reserve() method to DenseSet as well as an insert() for R-value
Recommit 278600 with some fixes to make the test more robust.

llvm-svn: 278604
2016-08-13 20:42:19 +00:00
Mehdi Amini bf0010934b Revert "[ADT] Add a reserve method to DenseSet as well as an insert() for R-value"
This reverts commit r278600. The unittest does not pass on MSVC, there is
an extra move. Investigating how to make it more robust.

llvm-svn: 278603
2016-08-13 20:14:39 +00:00
Yaron Keren 782788b7a1 Limit DenseMap::setNumEntries input to 1<<31, in accordance with the 31 bits allocated to NumEntries.
std::numeric_limits<int>::max() may be something else than 1<<31.

llvm-svn: 278602
2016-08-13 19:46:31 +00:00
Mehdi Amini 7079efe1ce Add missing REQUIRES in sancov/print_coverage_pcs.test: it requires aarch64 as well now
llvm-svn: 278601
2016-08-13 19:44:02 +00:00
Mehdi Amini d866d8a03f [ADT] Add a reserve method to DenseSet as well as an insert() for R-value
llvm-svn: 278600
2016-08-13 19:40:13 +00:00
Sanjay Patel 08c876673e [x86] add tests to show missed 64-bit immediate merging
Tests are slightly modified versions of those written by
Sunita Marathe in D23391.

llvm-svn: 278599
2016-08-13 18:42:14 +00:00
Aditya Kumar f24939b1f4 Test commit
llvm-svn: 278598
2016-08-13 11:56:50 +00:00
Craig Topper 8c372a31b7 [X86] Add a check of isCommutable at the top of X86InstrInfo::findCommutedOpIndices. Most callers don't check if the instruction is commutable before calling.
This saves us the trouble of ending up in the default of the switch and having to determine if this is an FMA or not.

llvm-svn: 278597
2016-08-13 06:48:44 +00:00
Craig Topper eafdbecc44 [AVX-512] Add isCommutable to scalar FMA3 instructions.
llvm-svn: 278596
2016-08-13 06:48:41 +00:00
Craig Topper 5f2441d8f3 [AVX-512] Add commutable flags to 132 form FMA3 instructions.
llvm-svn: 278595
2016-08-13 06:48:39 +00:00
Craig Topper e5115aa4ca [X86] Remove patterns for (vzmovl (insert_subvector undef, (scalar_to_vector))) as the (vzmovl VR256) pattern has higher priority. NFC
llvm-svn: 278594
2016-08-13 06:02:19 +00:00
Craig Topper 3f8126e6fa [AVX-512] Remove an AddedComplexity that was prioritizing basic vzmovl patterns over more complex ones that produce better code.
llvm-svn: 278593
2016-08-13 05:43:20 +00:00
Craig Topper 600685d510 [AVX-512] Add patterns to support VZEXT_MOVL from 512-bit vectors with 64-bit and 32-bit elements.
Fixes PR28961.

llvm-svn: 278592
2016-08-13 05:33:12 +00:00
Teresa Johnson 1eca6bc6a7 [PM] Port LoopDataPrefetch to new pass manager
Summary:
Refactor the existing support into a LoopDataPrefetch implementation
class and a LoopDataPrefetchLegacyPass class that invokes it.
Add a new LoopDataPrefetchPass for the new pass manager that utilizes
the LoopDataPrefetch implementation class.

Reviewers: mehdi_amini

Subscribers: sanjoy, mzolotukhin, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D23483

llvm-svn: 278591
2016-08-13 04:11:27 +00:00
Matt Arsenault c1ebd82ebe AMDGPU: Fix not estimating MBB operand sizes correctly
llvm-svn: 278590
2016-08-13 01:43:54 +00:00
Matt Arsenault 3cc1e0066d AMDGPU: Fix missing test for addressing mode with odd offsets
Add test if the constant offset looks unaligned.

llvm-svn: 278589
2016-08-13 01:43:51 +00:00
Matt Arsenault 44f6d694b3 AMDGPU/R600: Remove macros
llvm-svn: 278588
2016-08-13 01:43:46 +00:00
Hans Wennborg 0dd9ed1d45 Fix more dereferenced end() iterators after r278532
llvm-svn: 278587
2016-08-13 01:12:49 +00:00
Pete Cooper 35b00d5d9e Constify ValueTracking. NFC.
Almost all of the method here are only analysing Value's as opposed to
mutating them.  Mark all of the easy ones as const.

llvm-svn: 278585
2016-08-13 01:05:32 +00:00
Sanjoy Das 3502511548 [IndVars] Ignore (s|z)exts that don't extend the induction variable
`IVVisitor::visitCast` used to have the invariant that if the
instruction it was passed was a sext or zext instruction, the result of
the instruction would be wider than the induction variable.  This is no
longer true after rL275037, so this change teaches `IndVarSimplify` s
implementation of `IVVisitor::visitCast` to work with the relaxed
invariant.

A corresponding change to SimplifyIndVar to preserve the said invariant
after rL275037 would also work, but given how `IVVisitor::visitCast` is
spelled (no indication of said invariant), I figured the current fix is
cleaner.

Fixes PR28935.

llvm-svn: 278584
2016-08-13 00:58:31 +00:00
Eugene Zelenko 3e3a057c20 Fix some Clang-tidy modernize-use-using and Include What You Use warnings.
Differential revision: https://reviews.llvm.org/D23478

llvm-svn: 278583
2016-08-13 00:50:41 +00:00
Kostya Serebryany f5bb42c081 [libFuzzer] mention one more trophie in LLVM
llvm-svn: 278582
2016-08-13 00:12:32 +00:00
Justin Lebar d1675aadf6 [LSV] Use a set rather than an ArraySlice at the end of getVectorizablePrefix. NFC
Summary: This avoids a small O(n^2) loop.

Reviewers: asbirlea

Subscribers: mzolotukhin, llvm-commits, arsenm

Differential Revision: https://reviews.llvm.org/D23473

llvm-svn: 278581
2016-08-13 00:04:12 +00:00
Justin Lebar 222ceff289 [LSV] Use OrderedBasicBlock instead of rolling it ourselves. NFC
Summary:
In getVectorizablePrefix, this is less efficient (because we have to
iterate over the BB twice), but boy is it simpler.  Given how much
trouble we've had here, I think the simplicity gain is worthwhile.

In reorder(), this is actually more efficient, as
DominatorTree::dominates iterates over the BB from the beginning when
the two instructions are in the same BB.

Reviewers: asbirlea

Subscribers: arsenm, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23472

llvm-svn: 278580
2016-08-13 00:04:08 +00:00
Justin Lebar cf56e92c50 Minor comment fix ("generate" --> "generates").
llvm-svn: 278578
2016-08-12 23:58:19 +00:00
Hans Wennborg 2d87ccfd58 X86: Fix another dereferenced end() iterator after r278532
llvm-svn: 278577
2016-08-12 23:35:59 +00:00
Dominic Chen 4a9b99ee92 [WebAssembly] Re-enable disabled debug value test
Summary:
This test was resulting in asan/valgrind failures due to undefined
DWARF register mappings for WebAssembly, and was disabled in r278495.
These have been resolved.

Reviewers: sunfish, dschuff

Subscribers: bkramer, llvm-commits, jfb

Differential Revision: https://reviews.llvm.org/D23459

llvm-svn: 278576
2016-08-12 23:14:18 +00:00
Haicheng Wu 7c4535d1e7 Reapply [BranchFolding] Restrict tail merging loop blocks after MBP
Fixed a bug in the test case.

To fix PR28104, this patch restricts tail merging to blocks that belong to the
same loop after MBP.

llvm-svn: 278575
2016-08-12 23:13:38 +00:00
Dominic Chen 2868fa171a Avoid accessing LLVM/DWARF register mappings if undefined
Summary:
If the backend does not define LLVM/DWARF register mappings, the associated
variables are undefined since the map initializer is called by auto-generated
TableGen routines. This patch initializes the pointers and sizes to nullptr
and zero, respectively, and checks that they are valid before searching
for a mapping.

Reviewers: grosbach, dschuff

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23458

llvm-svn: 278574
2016-08-12 23:12:59 +00:00
Tim Shen c9c0d2dcb5 [LoopVectorize] Detect loops in the innermost loop before creating InnerLoopVectorizer
InnerLoopVectorizer shouldn't handle a loop with cycles inside the loop
body, even if that cycle isn't a natural loop.

Fixes PR28541.

Differential Revision: https://reviews.llvm.org/D22952

llvm-svn: 278573
2016-08-12 22:47:13 +00:00
Duncan P. N. Exon Smith 69b0650548 X86: Stop dereferencing end() in X86FrameLowering::emitEpilogue
On a Windows build of Chromium, r278532 (up to r278539)
X86FrameLowering::emitEpilogue because it wasn't wary enough of the
return of MachineBasicBlock::getFirstTerminator.  Guard all the uses
here.

Note that r278532 *looks* like an NFC commit (just an API change), but
it removes a couple of layers of abstraction and is probably causing
optimization differences in MSVC.

llvm-svn: 278572
2016-08-12 22:43:33 +00:00
Reid Kleckner 6ee00a2602 [Inliner] Don't treat inalloca allocas as static
They aren't static, and moving them to the entry block across something
else will only result in tears.

Root cause of http://crbug.com/636558.

llvm-svn: 278571
2016-08-12 22:23:04 +00:00
Pete Cooper ab47fa643b Add support to paternmatch for simple const Value cases.
Pattern match has some paths which can operate on constant instructions,
but not all.  This adds a version of m_value() to return const Value* and
changes ICmp matching to use auto so that it can match both constant and
mutable instructions.

Tests also included for both mutable and constant ICmpInst matching.

This will be used in a future commit to constify ValueTracking.cpp.

llvm-svn: 278570
2016-08-12 22:16:05 +00:00
Tim Shen e78e32a443 [ADT] Add filter_iterator for filtering elements
Differential Revision: https://reviews.llvm.org/D22951

llvm-svn: 278569
2016-08-12 22:03:28 +00:00
Artem Belevich 2f0a3dfe64 [NVPTX] Use untyped (.b) integer registers in PTX.
This bring LLVM-generated PTX closer to what nvcc generates and avoids
triggering issues in ptxas.

For instance, ptxas does not accept .s16 (or .u16) registers as operands
for .fp16 instructions.

Differential Revision: https://reviews.llvm.org/D23460

llvm-svn: 278568
2016-08-12 22:02:19 +00:00
Saleem Abdulrasool 0bc85613f7 gold: add a cast to appease std::max NFC
llvm-svn: 278567
2016-08-12 21:56:12 +00:00
Teresa Johnson 358657f27e [PM] BitcodeWriterPass should derive from PassInfoMixin
Summary:
The BitcodeWriterPass was ported a couple years ago, and predates the
PassInfoMixin. Make BitcodeWriterPass from that base class.

Should BitcodeWriterPass be added to the PassRegistry.def file? It seems
like that is only for passes that can be added arbitrarily, e.g. via the
-passes flag to the opt tool. Whereas the bitcode writer is added
specially based on the output type (and requires an output stream and
other parameters). For now I have left it out of the PassRegistry, but
let me know if it should go there.

Finally, I was considering an NFC change of the legacy WriteBitcodePass
to BitcodeWriterLegacyPass to make its usage clearer and more consistent
with other legacy passes. WDYT?

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23465

llvm-svn: 278566
2016-08-12 21:33:36 +00:00
David L Kreitzer 9667417a1a Fixed typo.
llvm-svn: 278565
2016-08-12 21:06:53 +00:00
Krzysztof Parzyszek f285963608 [Hexagon] Cleanup and standardize vector load/store pseudo instructions
Remove the following single-vector load/store pseudo instructions, use real
instructions instead:
  LDriv_pseudo_V6         STriv_pseudo_V6
  LDriv_pseudo_V6_128B    STriv_pseudo_V6_128B
  LDrivv_indexed          STrivv_indexed
  LDrivv_indexed_128B     STrivv_indexed_128B

Rename the double-vector load/store pseudo instructions, add unaligned
counterparts:

  -- old --               -- new --            -- unaligned --
  LDrivv_pseudo_V6        PS_vloadrw_io        PS_vloadrwu_io
  LDrivv_pseudo_V6_128B   PS_vloadrw_io_128B   PS_vloadrwu_io_128B
  STrivv_pseudo_V6        PS_vstorerw_io       PS_vstorerwu_io
  STrivv_pseudo_V6_128B   PS_vstorerw_io_128   PS_vstorerwu_io_128

llvm-svn: 278564
2016-08-12 21:05:05 +00:00
Kostya Serebryany 5d70d82f60 [libFuzzer] fix typo in docs
llvm-svn: 278563
2016-08-12 20:42:24 +00:00
Eli Friedman f184e4befc [AArch64LoadStoreOptimizer] Check aliasing correctly when creating paired loads/stores.
The existing code accidentally skipped the aliasing check in edge cases.

Differential revision: https://reviews.llvm.org/D23372

llvm-svn: 278562
2016-08-12 20:39:51 +00:00
Mike Aizatsky f4fdb5ddf3 [AArch64] Registering default MCInstrAnalysis
Even in this form it is useful: it can detect branch instructions.

https://github.com/google/sanitizers/issues/706

Subscribers: aemerson, rengolin

Differential Revision: https://reviews.llvm.org/D23426

llvm-svn: 278560
2016-08-12 20:28:05 +00:00
Eli Friedman 8585e9d33d [AArch64LoadStoreOpt] Handle offsets correctly for post-indexed paired loads.
Trunk would try to create something like "stp x9, x8, [x0], #512", which isn't actually a valid instruction.

Differential revision: https://reviews.llvm.org/D23368

llvm-svn: 278559
2016-08-12 20:28:02 +00:00
Chris Bieneman c5e19b6f8d Remove autoconf references from LICENSE.TXT
Since we don't actually have the autoconf subdirectories anymore, we don't need this reference here.

llvm-svn: 278558
2016-08-12 20:11:03 +00:00
Kevin Enderby c614d283b7 Next set of additional error checks for invalid Mach-O files.
This contains the two missing checks for LC_SEGMENT load command fields.
And checks for the Mach-O sections fields that would make them invalid.

With the new checks, some of the existing malformed file checks now trips one
of these instead of the issue it was having before so those tests were adjusted.

llvm-svn: 278557
2016-08-12 20:10:25 +00:00
Mike Aizatsky 17a907588c [sancov] test file cleanup
llvm-svn: 278556
2016-08-12 20:06:32 +00:00
Mike Aizatsky 3c4d60ad89 [sancov] MachO indirect symbols support.
Differential Revision: https://reviews.llvm.org/D23338

llvm-svn: 278551
2016-08-12 19:25:59 +00:00
Tim Shen dc698c3e91 [PPC] Memoize getValueBits. NFC.
Summary: It triggers exponential behavior when the DAG has many branches.

Reviewers: hfinkel, kbarton

Subscribers: iteratee, nemanjai, echristo

Differential Revision: https://reviews.llvm.org/D23428

llvm-svn: 278548
2016-08-12 18:40:04 +00:00
Benjamin Kramer 9bc1b230fd [WebAssembly] Plug MachineMemOperand leaks.
llvm-svn: 278545
2016-08-12 18:33:50 +00:00
Dan Liew ed3c9cae49 [LibFuzzer] Fix `-jobs=<N>` where <N> > 1 and the number of workers is > 1 on macOS.
The original `ExecuteCommand()` called `system()` from the C library.
The C library implementation of this on macOS contains a mutex which
serializes calls to `system()`. This prevented the `-jobs=` flag
from running copies of the fuzzing binary in parallel which is
the opposite of what is intended.

To fix this on macOS an alternative implementation of `ExecuteCommand()`
is provided that can be used concurrently. This is provided in
`FuzzerUtilDarwin.cpp` which is guarded to only compile code on Apple
platforms. The existing implementation has been moved to a new file
`FuzzerUtilLinux.cpp` which is guarded to only compile code on Linux.

This commit includes a simple test to check that LibFuzzer is being
executed in parallel when requested.

Differential Revision: https://reviews.llvm.org/D22742

llvm-svn: 278544
2016-08-12 18:29:36 +00:00
Duncan P. N. Exon Smith b1669ba1ed ADT: Remove stale header comments about next/prev after r278532
Thanks to Mehdi for noticing.

llvm-svn: 278542
2016-08-12 18:14:42 +00:00
Duncan P. N. Exon Smith ad9a5951d9 Hide type trait from r278532 from MSVC
The fixup from r278537 was insufficient.  Just #ifdef it out for MSVC.

llvm-svn: 278539
2016-08-12 18:10:29 +00:00
Duncan P. N. Exon Smith ef07ac9584 Try to appease win7 bots after r278532 by cleaning up type trait
The HasGetNext type trait was cluttered with a few things it didn't
need.  Try to clean it up, hoping to fix windows bots:
  http://lab.llvm.org:8011/builders/lld-x86_64-win7/builds/38063

I may just have to delete the trait...

llvm-svn: 278537
2016-08-12 17:54:54 +00:00
David Majnemer 600158f959 Remove whitespace
llvm-svn: 278536
2016-08-12 17:53:06 +00:00
Duncan P. N. Exon Smith 0a4d57172a ADT: Remove the ilist_nextprev_traits customization point
No one is using the capability to implement next and prev another way
(since lld stopped doing it in r278468).  Remove the customization point
by moving the API from ilist_nextprev_traits<T> to ilist_node_access.

The old traits class is still useful/necessary API as a target for
friends of node types that inherit privately from ilist_node.
Eventually I plan to either remove it entirely or move the template
parameters to the methods.

(Note: if there's desire to bring back customization of next/prev
pointers in the future (e.g., to pack some bits in there), I think a
traits class like this is an awkward way to accomplish it.  Instead, we
should change ilist<T> to be ilist<ilist_node<T>>, and give an extra
template parameter to ilist_node.)

llvm-svn: 278532
2016-08-12 17:32:34 +00:00
Michael Kuperstein 31b8399beb [PM] Port LowerInvoke to the new pass manager
llvm-svn: 278531
2016-08-12 17:28:27 +00:00
Pete Cooper 980a935e27 constify InstCombine::foldAllocaCmp. NFC.
This is part of an effort to constify ValueTracking.cpp.  This change is
to methods which need const Value* instead of Value* to go with the upcoming
changes to ValueTracking.

llvm-svn: 278528
2016-08-12 17:13:28 +00:00
Dehao Chen c0a1e432c7 Fine tuning of sample profile propagation algorithm.
Summary: The refined propagation algorithm is more accurate and robust.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23224

llvm-svn: 278522
2016-08-12 16:22:12 +00:00
Artur Pilipenko 87e4038a91 [x86] X86ISelLowering zext(add_nuw(x, C)) --> add(zext(x), C_zext)
Currently X86ISelLowering has a similar transformation for sexts:
sext(add_nsw(x, C)) --> add(sext(x), C_sext)

In this change I extend this code to handle zexts as well.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D23359

llvm-svn: 278520
2016-08-12 16:08:30 +00:00
Ehsan Amiri 17e1701075 [BasicAA] Avoid calling GetUnderlyingObject, when the result of a previous call can be reused.
Recursive calls to aliasCheck from alias[GEP|Select|PHI] may result in a second call to GetUnderlyingObject for a Value, whose underlying object is already computed. This patch ensures that in this situations, the underlying object is not computed again, and the result of the previous call is resued.

https://reviews.llvm.org/D22305

llvm-svn: 278519
2016-08-12 16:05:03 +00:00
Artur Pilipenko 2e8f82d962 [LVI] Take guards into account
Teach LVI to gather control dependant constraints from guards.

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D23358

llvm-svn: 278518
2016-08-12 15:52:23 +00:00
Teresa Johnson 77a1f7566c Add move ops to satisfy MSVC.
Try to appease Windows bots after r278508:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/27250
http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/14776

llvm-svn: 278517
2016-08-12 15:39:26 +00:00
Geoff Berry 22dfbc5637 [AArch64] Re-factor code shared by AArch64LoadStoreOpt and AArch64InstrInfo.
This re-factoring could cause the following slight changes in generated
code, though none were observed during testing:

- MachineScheduler could decide not to cluster some loads/stores if
  there are other load/stores with non-pairable opcodes that have the
  same base register and offset as a pairable set of load/stores.  One
  case of different MachineScheduler pairing did show up in my testing,
  but it wasn't due to this issue, but due
  BaseMemOpClusterMutation::clusterNeighboringMemOps() being unstable
  w.r.t. the order it considers memory operations.  See PR28942.

- The ImplicitNullChecks optimization could be done for more load/store
  opcodes.  This optimization isn't done for C/C++ code, so it didn't
  show up in my testing.

Reviewers: mcrosier, t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D23365

llvm-svn: 278515
2016-08-12 15:26:00 +00:00
Artur Pilipenko b623088abe [LVI] Fix potential memory corruption in getValueFromCondition
Rewrite Visited[Cond] = getValueFromConditionImpl(..., Visited) statement which can lead to a memory corruption since getValueFromConditionImpl changes Visited map and invalidates the iterators.

llvm-svn: 278514
2016-08-12 15:08:15 +00:00
Duncan P. N. Exon Smith 0d2ed35d3e ADT: Share code for embedded sentinel traits, NFC
Share code for the (mostly problematic) embedded sentinel traits.
- Move the LLVM_NO_SANITIZE("object-size") attribute to
  ilist_half_embedded_sentinel_traits and ilist_embedded_sentinel_traits
  (previously it spread throughout the code duplication).
- Add an ilist_full_embedded_sentinel_traits which has no UB (but has
  the downside of storing the complete node).
- Replace all the custom sentinel traits in LLVM with a declaration of
  ilist_sentinel_traits that inherits from one of the embedded sentinel
  traits classes.

There are still custom sentinel traits in other LLVM subprojects.  I'll
remove those in a follow-up.

Nothing at all should be changing here, this is just rearranging code.
Note that the final goal here is to remove the sentinel traits
altogether, settling on the memory layout of
ilist_half_embedded_sentinel_traits without the UB.  This intermediate
step moves the logic into ilist.h.

llvm-svn: 278513
2016-08-12 15:00:55 +00:00
Teresa Johnson ad7eb9db46 Fix type to avoid problems on 32-bit builds
lto::InputFile::Symbol::getCommonSize should return uint64_t instead of
size_t since it is returning the result of DataLayout::getTypeAllocSize
which returns uint64_t, and the result of getCommonSize is assigned to a
uint64_t variable. On 32-bit builds size_t is unsigned int and there are
type errors. This was introduced in r278338.

llvm-svn: 278512
2016-08-12 14:55:43 +00:00
James Y Knight 2cc9da9a65 Revert "[Sparc] Leon errata fix passes."
...and the two followup commits:
Revert "[Sparc][Leon] Missed resetting option flags from check-in 278489."
Revert "[Sparc][Leon] Errata fixes for various errata in different
versions of the Leon variants of the Sparc 32 bit processor."

This reverts commit r274856, r278489, and r278492.

llvm-svn: 278511
2016-08-12 14:48:09 +00:00
Teresa Johnson 4223dd8559 [PM] Port NameAnonFunction pass to new pass manager
Summary:
Port the NameAnonFunction pass and add a test.

Depends on D23439.

Reviewers: mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23440

llvm-svn: 278509
2016-08-12 14:03:36 +00:00
Teresa Johnson f93b246f8b [PM] Port ModuleSummaryIndex analysis to new pass manager
Summary:
Port the ModuleSummaryAnalysisWrapperPass to the new pass manager.
Use it in the ported BitcodeWriterPass (similar to how we use the
legacy ModuleSummaryAnalysisWrapperPass in the legacy WriteBitcodePass).

Also, pass the -module-summary opt flag through to the new pass
manager pipeline and through to the bitcode writer pass, and add
a test that uses it.

Reviewers: mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23439

llvm-svn: 278508
2016-08-12 13:53:02 +00:00
Simon Pilgrim 687d71e877 [X86][SSE] Add support for combining target shuffles to PSLLDQ/PSRLDQ byte shifts
llvm-svn: 278502
2016-08-12 11:24:34 +00:00
Krzysztof Parzyszek be976d4ea9 [Hexagon] Standardize pseudo-instructions for calls and returns
- CALLv3nr        PS_call_nr
- CALLRv3nr       PS_callr_nr
- CALLstk         PS_call_stk

- TCRETURNi       PS_tailcall_i
- TCRETURNr       PS_tailcall_r

- JMPret          PS_jmpret
- JMPrett         PS_jmprett
- JMPretf         PS_jmpretf
- JMPrettnew      PS_jmprettnew
- JMPretfnew      PS_jmpretfnew
- JMPrettnewpt    PS_jmprettnewpt
- JMPretfnewpt    PS_jmpretfnewpt

llvm-svn: 278499
2016-08-12 11:12:02 +00:00
Krzysztof Parzyszek ab9127ca3c [Hexagon] Treat non-returning indirect calls as scheduling boundaries
llvm-svn: 278498
2016-08-12 11:01:10 +00:00
Artur Pilipenko 6669f253d5 [LVI] Take range metadata into account while calculating icmp condition constraints
Take range metadata into account for conditions like this:

%length = load i32, i32* %length_ptr, !range !{i32 0, i32 2147483647}
%cmp = icmp ult i32 %a, %length

This is a common pattern for range checks where the length of the array is dynamically loaded.

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D23267

llvm-svn: 278496
2016-08-12 10:14:11 +00:00
Benjamin Kramer 05e760ec4b [Webassembly] disable unstable test.
It reads uninitialized memory and crashes randomly.

llvm-svn: 278495
2016-08-12 10:13:45 +00:00
Simon Pilgrim ed96b9adfb [X86][SSE] Fixed PALIGNR target shuffle decode
The PALIGNR target shuffle decode was not taking into account that DecodePALIGNRMask (rather oddly) expects the operands to be in reverse order, nor was it detecting unary patterns, causing combines to combine with the incorrect input.

The cgbuiltin, auto upgrade and instruction comments code correctly swap the operands so are not affected.

llvm-svn: 278494
2016-08-12 10:10:51 +00:00
Artur Pilipenko 635625855f [LVI] Handle any predicate in comparisons like icmp <pred> (add Val, Offset), ...
Currently LVI can only gather value constraints from comparisons like:

* icmp <pred> Val, ...
* icmp ult (add Val, Offset), ...

In fact we can handle any predicate in latter comparisons.

Reviewed By: sanjoy

Differential Revision: https://reviews.llvm.org/D23357

llvm-svn: 278493
2016-08-12 10:05:11 +00:00
Chris Dewhurst 5247af24c3 [Sparc][Leon] Missed resetting option flags from check-in 278489.
llvm-svn: 278492
2016-08-12 09:54:39 +00:00
Chris Dewhurst 829f8efe55 [Sparc][Leon] Errata fixes for various errata in different versions of the Leon variants of the Sparc 32 bit processor.
The nature of the errata are listed in the comments preceding the errata fix passes. Relevant unit tests are implemented for each of these.

These changes update older versions of these errata fixes with improvements to code and unit tests.

Differential Revision: https://reviews.llvm.org/D21960

llvm-svn: 278489
2016-08-12 09:34:26 +00:00
Benjamin Kramer bbff9c6130 [Coroutines] Move class into anonymous namespace.
Hopefully fixes visibility warnings from GCC. No functionality change.

llvm-svn: 278485
2016-08-12 08:47:13 +00:00
Haicheng Wu d9cbb1608f Revert "[BranchFolding] Restrict tail merging loop blocks after MBP"
This reverts commit r278463 because it hits the bot.

llvm-svn: 278484
2016-08-12 08:40:24 +00:00
Gor Nishanov 0f303accde [Coroutines]: Part6b: Add coro.id intrinsic.
Summary:
1. Make coroutine representation more robust against optimization that may duplicate instruction by introducing coro.id intrinsics that returns a token that will get fed into coro.alloc and coro.begin. Due to coro.id returning a token, it won't get duplicated and can be used as reliable indicator of coroutine identify when a particular coroutine call gets inlined.
2. Move last three arguments of coro.begin into coro.id as they will be shared if coro.begin will get duplicated.
3. doc + test + code updated to support the new intrinsic.

Reviewers: mehdi_amini, majnemer

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D23412

llvm-svn: 278481
2016-08-12 05:45:49 +00:00
Duncan P. N. Exon Smith f197b1f78f ADT: Remove all ilist_iterator => pointer casts, NFC
Remove all ilist_iterator to pointer casts.  There were two reasons for
casts:

  - Checking for an uninitialized (i.e., null) iterator.  I added
    MachineInstrBundleIterator::isValid() to check for that case.

  - Comparing an iterator against the underlying pointer value while
    avoiding converting the pointer value to an iterator.  This is
    occasionally necessary in MachineInstrBundleIterator, since there is
    an assertion in the constructors that the underlying MachineInstr is
    not bundled (but we don't care about that if we're just checking for
    pointer equality).

To support the latter case, I rewrote the == and != operators for
ilist_iterator and MachineInstrBundleIterator.

  - The implicit constructors now use enable_if to exclude
    const-iterator => non-const-iterator conversions from overload
    resolution (previously it was a compiler error on instantiation, now
    it's SFINAE).

  - The == and != operators are now global (friends), and are not
    templated.

  - MachineInstrBundleIterator has overloads to compare against both
    const_pointer and const_reference.  This avoids the implicit
    conversions to MachineInstrBundleIterator that assert, instead just
    checking the address (and I added unit tests to confirm this).

Notably, the only remaining uses of ilist_iterator::getNodePtrUnchecked
are in ilist.h, and no code outside of ilist*.h directly relies on this
UB end-iterator-to-pointer conversion anymore.  It's still needed for
ilist_*sentinel_traits, but I'll clean that up soon.

llvm-svn: 278478
2016-08-12 05:05:36 +00:00
David Majnemer 91a02f5bee Use the range variant of transform instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278477
2016-08-12 04:32:45 +00:00
David Majnemer 2d006e7673 Use the range variant of transform instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278476
2016-08-12 04:32:42 +00:00
David Majnemer c700490f48 Use the range variant of remove_if instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278475
2016-08-12 04:32:37 +00:00
David Majnemer 0da5afe717 Use the range variant of count_if instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278474
2016-08-12 04:32:29 +00:00
David Majnemer 42531260b3 Use the range variant of find/find_if instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278469
2016-08-12 03:55:06 +00:00
Duncan P. N. Exon Smith fb14ed0777 ADT: Add ilist_iterator conversions to/from ilist_node
Allow an ilist_iterator to be constructed from an ilist_node, and give
access to the underlying ilist_node as well.

This will be used immediately in lld to support a type-erasure use case.
Longer term, they'll stick around once the iterator is using
ilist_node<NodeTy>* instead of NodeTy*.

llvm-svn: 278467
2016-08-12 03:35:33 +00:00
Wei Mi 7e103d92cc Recommit 'Remove the restriction that MachineSinking is now stopped by
"insert_subreg, subreg_to_reg, and reg_sequence" instructions' after
adjusting some unittest checks.

This is to solve PR28852. The restriction was added at 2010 to make better register
coalescing. We assumed that it was not necessary any more. Testing results on x86
supported the assumption.

We will look closely to any performance impact it will bring and will be prepared
to help analyzing performance problem found on other architectures.

Differential Revision: https://reviews.llvm.org/D23210

llvm-svn: 278466
2016-08-12 03:33:22 +00:00
Haicheng Wu ea02372059 [BranchFolding] Restrict tail merging loop blocks after MBP
To fix PR28014, this patch restricts tail merging to blocks that belong to the
same loop after MBP.

Differential Revision: https://reviews.llvm.org/D23191

llvm-svn: 278463
2016-08-12 03:30:23 +00:00
Ivan Krasin 89439a7939 WholeProgramDevirt: initialize WasDevirt in all constructors.
Summary: This is a follow up to r278389 and r278442.

Differential Revision: https://reviews.llvm.org/D23438

llvm-svn: 278455
2016-08-12 01:40:10 +00:00
Eli Friedman a6707f56b5 [DSE] Don't remove stores made live by a call which unwinds.
Issue exposed by noalias or more aggressive alias analysis.

Fixes http://llvm.org/PR25422.

Differential revision: https://reviews.llvm.org/D21007

llvm-svn: 278451
2016-08-12 01:09:53 +00:00
Pete Cooper 54a0255679 Refactor isValidAssumeForContext to reduce duplication and indentation. NFC.
This method had some duplicate code when we did or did not have a dom tree.  Refactor
it to remove the duplication, but also clean up the control flow to have less duplication.

llvm-svn: 278450
2016-08-12 01:00:15 +00:00
David Majnemer 562e82945e Use the range variant of find_if instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278443
2016-08-12 00:18:03 +00:00
Ivan Krasin 067cc24aaf WholeProgramDevirt: fix access to a non-initialized field.
Summary: This is a follow up to r278389, where I have introduced the bug

Reviewers: mehdi_amini

Differential Revision: https://reviews.llvm.org/D23436

llvm-svn: 278442
2016-08-12 00:07:14 +00:00
Xinliang David Li 1ce88fa0a5 Add comment /NFC
llvm-svn: 278438
2016-08-11 23:09:56 +00:00
Tim Shen 6aaeb9b185 [ADT] Migrate DepthFirstIterator to use NodeRef
Summary:
Notice that the data layout is changed: instead of using
std::pair<PointerIntPair<NodeType*, 1>, ChildItTy>, now use
std::pair<NodeRef, Optional<ChildItTy>>.

A NFC but worth noticing change is operator==(), since we only compare
an iterator against end(), it's better to put an assert there and make
people noticed when it fails.

Reviewers: dblaikie, chandlerc

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D23146

llvm-svn: 278437
2016-08-11 22:36:16 +00:00
Xinliang David Li cbb5e02f4a Fix typos /NFC
llvm-svn: 278436
2016-08-11 22:34:00 +00:00
Pete Cooper fa7ae4f3b6 Remove unnecessary extra version of isValidAssumeForContext. NFC.
There were 2 versions of this method.  A public one which takes a
const Instruction* and a private implementation which takes a mutable
Value* and casts to an Instruction*.

There was no need for the 2 versions as all callers pass a const Instruction*
and there was no need for a mutable pointer as we only do analysis here.

llvm-svn: 278434
2016-08-11 22:23:07 +00:00
David Majnemer 0d955d0bf5 Use the range variant of find instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278433
2016-08-11 22:21:41 +00:00
Piotr Padlewski 332b3b2210 Don't import variadic functions
Summary:
This patch adds IsVariadicFunction bit to summary in order
to not import variadic functions. Inliner doesn't inline
variadic functions because it is hard to reason about it.

This one small fix improves Importer by about 16%
(going from 86% to 100% of imported functions that are
inlined anywhere)
on some spec benchmarks like 'int' and others.

Reviewers: eraman, mehdi_amini, tejohnson

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D23339

llvm-svn: 278432
2016-08-11 22:13:57 +00:00
Vyacheslav Klochkov 6daefcf626 X86-FMA3: Implemented commute transformation for EVEX/AVX512 FMA3 opcodes.
This helped to improved memory-folding and register coalescing optimizations.

Also, this patch fixed the tracker #17229.

Reviewer: Craig Topper.
Differential Revision: https://reviews.llvm.org/D23108

llvm-svn: 278431
2016-08-11 22:07:33 +00:00
Rui Ueyama 8950a53821 Re-commit r278066: Do not ignore SizeOfOptionalHeader in COFF header even if PE header is not present.
llvm-svn: 278429
2016-08-11 22:02:44 +00:00
Tim Northover 8e0c53a018 GlobalISel: support 'null' constant in translation.
It's sharing the integer G_CONSTANT for now since I don't *think* it creates
any ambiguity (even on weird archs). If that turns out wrong we can create a
G_PTRCONSTANT or something.

llvm-svn: 278423
2016-08-11 21:40:55 +00:00
Ehsan Amiri dbcfea9811 Extend trip count instead of truncating IV in LFTR, when legal
When legal, extending trip count in the loop control logic generates better code compared to truncating IV. This is because

(1) extending trip count is a loop invariant operation (see genLoopLimit where we prove trip count is loop invariant).
(2) Scalar Evolution seems to have problems understanding trunc when computing loop trip count. So removing them allows better analysis performed in Scalar Evolution. (In particular this fixes PR 28363 which is the motivation for this change).

I am not going to perform any performance test. Any degradation caused by this should be an indication of a bug elsewhere.

To prove legality, we rely on SCEV to prove zext(trunc(IV)) == IV (or similarly for sext). If this holds, we can prove equivalence of trunc(IV)==ExitCnt (1) and IV == zext(ExitCnt). Simply take zext of boths sides of (1) and apply the proven equivalence.

This commit contains changes in a newly added testcase which was not included in the previous commit (which was reverted later on).

https://reviews.llvm.org/D23075

llvm-svn: 278421
2016-08-11 21:31:40 +00:00
Daniel Berlin da2f38e0f4 [MSSA] Use is_contained
llvm-svn: 278418
2016-08-11 21:26:50 +00:00
David Majnemer 0a16c22846 Use range algorithms instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278417
2016-08-11 21:15:00 +00:00
Krzysztof Parzyszek 1b689da04e [Hexagon] Allow non-returning calls in hardware loops
llvm-svn: 278416
2016-08-11 21:14:25 +00:00
David Majnemer 3124181eef [vim] Add more attributes to llvm.vim
llvm-svn: 278415
2016-08-11 21:14:05 +00:00
Matt Arsenault 18da70dd2d AMDGPU: Remove unused tablegen utilities
llvm-svn: 278414
2016-08-11 21:08:43 +00:00
Geoff Berry d01828096f [SCEV] Update interface to handle SCEVExpander insert point motion.
Summary:
This is an extension of the fix in r271424.  That fix dealt with builder
insert points being moved by SCEV expansion, but only for the lifetime
of the expand call.  This change modifies the interface so that LSR can
safely call expand multiple times at the same insert point and do the
right thing if one of the expansions decides to move the original insert
point.

This is a fix for PR28719.

Reviewers: sanjoy

Subscribers: llvm-commits, mcrosier, mzolotukhin

Differential Revision: https://reviews.llvm.org/D23342

llvm-svn: 278413
2016-08-11 21:05:17 +00:00
Tim Northover da6f5f2d0a Remove empty file left by partial reversion.
llvm-svn: 278411
2016-08-11 21:01:15 +00:00
Tim Northover 30e67ce793 GlobalISel: add translation support for shift operations.
llvm-svn: 278410
2016-08-11 21:01:13 +00:00
Tim Northover f1f7bf1279 GlobalISel: support zext & sext during translation phase.
llvm-svn: 278409
2016-08-11 21:01:10 +00:00
Teresa Johnson faa7506f18 Fix type truncation warnings
Avoid type truncation warnings from a 32-bit bot due to size_t not
being unsigned long long, by converting the variables and constants to
unsigned. This was introduced by r278338 and caused warnings here:
http://bb.pgr.jp/builders/i686-mingw32-RA-on-linux/builds/15527/steps/build_llvmclang/logs/warnings%20%287%29

llvm-svn: 278406
2016-08-11 20:38:39 +00:00
Daniel Berlin 2698cbb4f1 Move GVNHoist tests into their own directory since it is a separate pass
llvm-svn: 278404
2016-08-11 20:35:07 +00:00
Wei Ding 70cda07526 AMDGPU : Add intrinsic for instruction v_cvt_pk_u8_f32
Differential Revision: http://reviews.llvm.org/D23336

llvm-svn: 278403
2016-08-11 20:34:48 +00:00
Wei Mi 3ab5816000 Revert rL278384 which caused several buildbot failures (like check failures in CodeGen/X86/clz.ll).
llvm-svn: 278402
2016-08-11 20:33:37 +00:00
Daniel Berlin f75fd1b58b Fix PR 28933
Summary:
This fixes PR 28933 by making sure GVNHoist does not try to recreate memory
accesses when it has not actually moved them.

Reviewers: sebpop

Subscribers: llvm-commits, george.burgess.iv

Differential Revision: https://reviews.llvm.org/D23411

llvm-svn: 278401
2016-08-11 20:32:43 +00:00
Tim Shen 0fdb2daa8d [ADT] Add relation operators for Optional
Summary: Make Optional's behavior the same as the coming std::optional.

Reviewers: dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23178

llvm-svn: 278397
2016-08-11 20:10:15 +00:00
Duncan P. N. Exon Smith 38eea4a76f CodeGen: Avoid dereferencing end() in MachineScheduler
Check MachineInstr::isDebugValue for the same instruction as we're
calling isSchedBoundary, avoiding the possibility of dereferencing
end().

This is a functionality change even when I!=end().  Matthias had a look
and agrees this is the right resolution (as opposed to checking for
end()).

This is triggered by a huge number of tests, but they happen to
magically pass right now.  I found this because WIP patches for PR26753
convert them into crashes.

llvm-svn: 278394
2016-08-11 20:03:09 +00:00
Matt Arsenault 2ffe8fd2ce AMDGPU: Prune includes
llvm-svn: 278391
2016-08-11 19:18:50 +00:00
Krzysztof Parzyszek 258af19d99 [Hexagon] Standardize "select" pseudo-instructions
- PS_pselect: general register pairs
- PS_vselect: vector registers (+ 128B version)
- PS_wselect: vector register pairs (+ 128B version)

llvm-svn: 278390
2016-08-11 19:12:18 +00:00
Ivan Krasin f3403fd2c8 WholeProgramDevirt: generate more detailed and accurate remarks.
Summary:
Keep track of all methods for which we have devirtualized at least
one call and then print them sorted alphabetically. That allows to
avoid duplicates and also makes the order deterministic.

Add optimization names into the remarks, so that it's easier to
understand how has each method been devirtualized.

Fix a bug when wrong methods could have been reported for
tryVirtualConstProp.

Reviewers: kcc, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23297

llvm-svn: 278389
2016-08-11 19:09:02 +00:00
Wei Mi ec19b35179 Remove the restriction that MachineSinking is now stopped by "insert_subreg,
subreg_to_reg, and reg_sequence" instructions.

This is to solve PR28852. The restriction was added at 2010 to make better register
coalescing. We assumed that it was not necessary any more. Testing results on x86
supported the assumption.

We will look closely to any performance impact it will bring and will be prepared
to help analyzing performance problem found on other architectures.

Differential Revision: https://reviews.llvm.org/D23210

llvm-svn: 278384
2016-08-11 18:42:56 +00:00
Krzysztof Parzyszek a003b76391 If-conversion incorrectly calculates liveness of redefined registers
Differential Revision: https://reviews.llvm.org/D23207

llvm-svn: 278383
2016-08-11 18:42:06 +00:00
Barnabas Bittner 35fef211a0 Test commit
llvm-svn: 278380
2016-08-11 18:34:29 +00:00
Andrew Kaylor 7cdf01ef58 Target independent codesize heuristics for Loop Idiom Recognition
Patch by Sunita Marathe

Differential Revision: https://reviews.llvm.org/D21449

llvm-svn: 278378
2016-08-11 18:28:33 +00:00
Easwaran Raman 61edc107bb Add a new method to create SimpleInliner instance and make pre-inliner use this.
This adds a createFunctionInliningPass pass that takes an InlineParams object and use this to create the pre-inliner pass. This prevents the regular inliner's threshold flag from influencing the preinliner.

Differential revision: https://reviews.llvm.org/D23377

llvm-svn: 278377
2016-08-11 18:24:08 +00:00
Krzysztof Parzyszek 60f0b51485 [Hexagon] Skip byval arguments when checking parameter attributes
From the point of view of register assignment, byval parameters are
ignored: a byval parameter is not going to be assigned to a register,
and it will not affect the assignments of subsequent parameters.
When matching registers with parameters in the bit tracker, make sure
to skip byval parameters before advancing the registers.

llvm-svn: 278375
2016-08-11 18:15:16 +00:00
Dominic Chen 6ba19659cb Improve virtual register handling when computing debug information
Summary: Some backends, like WebAssembly, use virtual registers instead of physical registers. This crashes the DbgValueHistoryCalculator pass, which assumes that all registers are physical. Instead, skip virtual registers when iterating aliases, and assume that they are clobbered.

Reviewers: dexonsmith, dschuff, aprantl

Subscribers: yurydelendik, llvm-commits, jfb, sunfish

Differential Revision: https://reviews.llvm.org/D22590

llvm-svn: 278371
2016-08-11 17:52:40 +00:00
Michael Kuperstein e36d7716c3 Make TwoAddressInstructionPass::rescheduleMIBelowKill subreg-aware
This fixes PR28824.

Differential Revision: https://reviews.llvm.org/D23220

llvm-svn: 278370
2016-08-11 17:38:33 +00:00
Matt Arsenault 56684d4538 AMDGPU: Fix crashes on memory functions
llvm-svn: 278369
2016-08-11 17:31:42 +00:00
Matt Arsenault 76837df6ff AArch64: Assert on analyzeBranch failing
llvm-svn: 278366
2016-08-11 17:22:59 +00:00
Michael Kuperstein ee900b62ef [AliasSetTracker] Delete dead code
Deletes unused remove() and containsPointer() interfaces. NFC.

Differential Revision: https://reviews.llvm.org/D23360

llvm-svn: 278365
2016-08-11 17:20:20 +00:00
Eugene Zelenko cdc7161281 Fix some Clang-tidy modernize and Include What You Use warnings.
Differential revision: https://reviews.llvm.org/D23291

llvm-svn: 278364
2016-08-11 17:20:18 +00:00
Teresa Johnson fe24bff8c6 Add move ops to satisfy MSVC.
Try to appease MSVC bot:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/27164/steps/run%20tests/logs/stdio

llvm-svn: 278363
2016-08-11 17:19:53 +00:00
Matt Arsenault 4b5fc093d0 AMDGPU: Remove custom getSubReg
This was kind of confusing, the subregister
class shouldn't really be necessary.

llvm-svn: 278362
2016-08-11 17:15:32 +00:00
Matt Arsenault 69fd2c1179 AMDGPU: Remove unused tracking of flat instructions
llvm-svn: 278361
2016-08-11 17:15:28 +00:00
Wei Ding d3344378c6 AMDGPU : Fix SAD related instruction LIT tests function atttibute issues.
Differential Revision: http://reviews.llvm.org/D23133

llvm-svn: 278360
2016-08-11 17:14:17 +00:00
Duncan P. N. Exon Smith ec30cc2171 Hexagon: Avoid dereferencing end() in HexagonCopyToCombine::findPairable
Check for end() before skipping through debug values.  This avoids
dereferencing end() when the instruction is the final one in the basic
block.  (It still assumes that a debug value will not be the final
instruction in the basic block.  No tests seemed to violate that.)

Many Hexagon tests trigger this, but they happen to magically pass right
now.  I found this because WIP patches for PR26753 convert them into
crashes.

llvm-svn: 278355
2016-08-11 16:40:03 +00:00
Wei Ding 34e1753585 AMDGPU : Add LLVM intrinsics for SAD related instructions.
Differential Revision: http://reviews.llvm.org/D23133

llvm-svn: 278354
2016-08-11 16:33:53 +00:00
Teresa Johnson 27ce116175 Add (hopefully last) remaining missing dependences to llvm-lto2
There are still a few missing symbols reported by:
http://bb.pgr.jp/builders/i686-mingw32-RA-on-linux/builds/15535/steps/build_llvmclang/logs/stdio
http://bb.pgr.jp/builders/i686-mingw32-RA-on-linux/builds/15535/steps/build_llvmclang/logs/stdio

This should hopefully take care of them.

llvm-svn: 278353
2016-08-11 16:29:47 +00:00
Tim Northover 0d51044b69 GlobalISel: clear vreg mapping after translating each function
Otherwise we only materialize (shared) constants in the first function they
appear in. This doesn't go well.

llvm-svn: 278351
2016-08-11 16:21:29 +00:00
Reid Kleckner 26f9e9ebc3 Remove FIXME about asserting on the end iterator
After machine block placement, MBBs may not have terminators, and it is
appropriate to check for the end iterator here. We can fold the check
into the next if, as well. This look is really just looking for BBs that
end in CATCHRET.

llvm-svn: 278350
2016-08-11 16:00:43 +00:00
Teresa Johnson fdfde19985 More missing llvm-lto2 dependencies
Follow-on to r278341: Update CMakeLists.txt to match LLVMBuild.txt

llvm-svn: 278349
2016-08-11 15:58:49 +00:00
Lang Hames 30526070ab [MCJIT] Improve documentation and error handling for MCJIT::runFunction.
ExecutionEngine::runFunction is supposed to allow execution of arbitrary
function types, but MCJIT can only reasonably support a limited subset of
main-linke function types. This patch documents this limitation, and fixes
MCJIT::runFunction to abort with a meaningful error at runtime if called with
an unsupported function type.

llvm-svn: 278348
2016-08-11 15:56:23 +00:00
Duncan P. N. Exon Smith 62e351f5a4 X86: Use operator lookup for operator==, NFC
Avoid relying on the MachineInstrBundleIterator operator== being
implemented as a member function.

llvm-svn: 278347
2016-08-11 15:51:29 +00:00
Duncan P. N. Exon Smith 43724649c3 IR: Don't cast the end iterator to Instruction*
End iterators are usually sentinels, not actually Instruction* at all.
Stop casting to it just to get an iterator back.

There is likely no observable functionality change here right now
(although this is relying on UB, I doubt it was triggering anything),
but I'll be removing the cast soon.

llvm-svn: 278346
2016-08-11 15:45:04 +00:00
Duncan P. N. Exon Smith 2e7af979b9 CodeGen: Check for a terminator in llvm::getFuncletMembership
Check for an end iterator from MachineBasicBlock::getFirstTerminator in
llvm::getFuncletMembership.  If this is turned into an assertion, it
fires in 48 X86 testcases (for example,
CodeGen/X86/regalloc-spill-at-ehpad.ll).

Since this is likely a latent bug (shouldn't all basic blocks end with a
terminator?) I've filed PR28938.

llvm-svn: 278344
2016-08-11 15:29:02 +00:00
Matthew Simpson 3f69195b9e [SLP] Make RecursionMaxDepth a command line option (NFC)
llvm-svn: 278343
2016-08-11 15:28:45 +00:00
Sanjay Patel 38ae83de38 fix comment; NFC
llvm-svn: 278342
2016-08-11 15:23:56 +00:00
Teresa Johnson a26adfb6a8 Fix bot failure from r278338 due to missing dependences
Add some missing dependences to the llvm-lto2 tool to attempt to appease
missing symbols in link from bot:
http://bb.pgr.jp/builders/i686-mingw32-RA-on-linux/builds/15527

llvm-svn: 278341
2016-08-11 15:23:05 +00:00
Sanjay Patel e3c335cbed use auto* with dyn_cast ; NFC
llvm-svn: 278340
2016-08-11 15:21:21 +00:00
Sanjay Patel 5a470950b9 getParent()->getParent() == getFunction() ; NFC
llvm-svn: 278339
2016-08-11 15:16:06 +00:00
Teresa Johnson 9ba95f99f3 Restore "Resolution-based LTO API."
This restores commit r278330, with fixes for a few bot failures:
- Fix a late change I had made to the save temps output file that I
  missed due to existing files sitting on my disk
- Fix a bunch of Windows bot failures with "ambiguous call to overloaded
  function" due to confusion between llvm::make_unique vs
  std::make_unique (preface the new make_unique calls with "llvm::")
- Attempt to fix a modules bot failure by adding a missing include
  to LTO/Config.h.

Original change:

Resolution-based LTO API.

Summary:
This introduces a resolution-based LTO API. The main advantage of this API over
existing APIs is that it allows the linker to supply a resolution for each
symbol in each object, rather than the combined object as a whole. This will
become increasingly important for use cases such as ThinLTO which require us
to process symbol resolutions in a more complicated way than just adjusting
linkage.

Patch by Peter Collingbourne.

Reviewers: rafael, tejohnson, mehdi_amini

Subscribers: lhames, tejohnson, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D20268

llvm-svn: 278338
2016-08-11 14:58:12 +00:00
Ehsan Amiri 3818f1b38a revert 278334
llvm-svn: 278337
2016-08-11 14:51:14 +00:00
Valery Pykhtin 82c73bee2b Revert "[AMDGPU] fix failure on printing of non-existing instruction operands."
This reverts revision 278333, newly added test failed.

llvm-svn: 278336
2016-08-11 14:22:05 +00:00
Ehsan Amiri b9fcc2b171 Extend trip count instead of truncating IV in LFTR, when legal
When legal, extending trip count in the loop control logic generates better code compared to truncating IV. This is because

(1) extending trip count is a loop invariant operation (see genLoopLimit where we prove trip count is loop invariant).
(2) Scalar Evolution seems to have problems understanding trunc when computing loop trip count. So removing them allows better analysis performed in Scalar Evolution. (In particular this fixes PR 28363 which is the motivation for this change).

I am not going to perform any performance test. Any degradation caused by this should be an indication of a bug elsewhere.

To prove legality, we rely on SCEV to prove zext(trunc(IV)) == IV (or similarly for sext). If this holds, we can prove equivalence of trunc(IV)==ExitCnt (1) and IV == zext(ExitCnt). Simply take zext of boths sides of (1) and apply the proven equivalence.

https://reviews.llvm.org/D23075

llvm-svn: 278334
2016-08-11 13:51:20 +00:00
Valery Pykhtin 3048ff6ec3 [AMDGPU] fix failure on printing of non-existing instruction operands.
Differential revision: https://reviews.llvm.org/D23323

llvm-svn: 278333
2016-08-11 13:49:46 +00:00
Teresa Johnson cbf684e6c6 Revert "Resolution-based LTO API."
This reverts commit r278330.

I made a change to the save temps output that is causing issues with the
bots. Didn't realize this because I had older output files sitting on
disk in my test output directory.

llvm-svn: 278331
2016-08-11 13:03:56 +00:00
Teresa Johnson f99573b3ee Resolution-based LTO API.
Summary:
This introduces a resolution-based LTO API. The main advantage of this API over
existing APIs is that it allows the linker to supply a resolution for each
symbol in each object, rather than the combined object as a whole. This will
become increasingly important for use cases such as ThinLTO which require us
to process symbol resolutions in a more complicated way than just adjusting
linkage.

Patch by Peter Collingbourne.

Reviewers: rafael, tejohnson, mehdi_amini

Subscribers: lhames, tejohnson, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D20268

Address review comments

llvm-svn: 278330
2016-08-11 12:56:40 +00:00
Simon Pilgrim 5c91764af5 Fixed VS2015 (Update 3) warning - differing const/volatile qualifiers for overridden function
Dropped the const qualifier to match llvm::CallLowering::lowerCall

llvm-svn: 278329
2016-08-11 12:19:43 +00:00
Igor Breger a77b14d02c [AVX512] Fix extractelement i1 lowering.
The previous implementation (not custom) doesn't enforce zeroing off upper bits. The assumption is that i1 PRODUCER (truncate and extractelement) must zero all upper bits, so i1 CONSUMER instructions ( test, zext, save, etc) can be done without additional zeroing.
Make extractelement i1 lowering custom for all vector i1.

Differential Revision: http://reviews.llvm.org/D23246

llvm-svn: 278328
2016-08-11 12:13:46 +00:00
Marina Yatsina 88f0c31f13 Avoid false dependencies of undef machine operands
This patch helps avoid false dependencies on undef registers by updating the machine instructions' undef operand to use a register that the instruction is truly dependent on, or use a register with clearance higher than Pref.

Pseudo example:

loop:
xmm0 = ...
xmm1 = vcvtsi2sdl eax, xmm0<undef>
... = inst xmm0
jmp loop

In this example, selecting xmm0 as the undef register creates false dependency between loop iterations.
This false dependency cannot be solved by inserting an xor before vcvtsi2sdl because xmm0 is alive at the point of the vcvtsi2sdl instruction.
Selecting a different register instead of xmm0, especially a register that is not used in the loop, will eliminate this problem.

Differential Revision: https://reviews.llvm.org/D22466

llvm-svn: 278321
2016-08-11 07:32:08 +00:00
Amjad Aboud b83e73bceb [Debug Info] Added a LIT test that covers the fix committed in rL277290.
Differential Revision: http://reviews.llvm.org/D23056

llvm-svn: 278320
2016-08-11 07:22:53 +00:00
Craig Topper a78b768ed4 [AVX-512] Promote 512-bit integer loads to v8i64 similar to what is done for 128/256-bit vectors for overall consistency.
llvm-svn: 278318
2016-08-11 06:04:07 +00:00
Craig Topper 14aa2665d3 [AVX-512] Add patterns to allow EVEX encoded stores of v16i16/v8i16/v16i8/v32i8 even when BWI is not supported.
llvm-svn: 278317
2016-08-11 06:04:04 +00:00
Craig Topper 3563d0f622 [AVX-512] Fix the 128-bit and 256-bit nontemporal load patterns with elements type other than i64. These loads have all been promoted to v2i64/v4i64 loads so we need bitcasts or we end up selecting VMOVDQA32/VMOVDQU32 instead.
llvm-svn: 278316
2016-08-11 06:04:00 +00:00
Xinliang David Li 76a0108be4 [Profile] improve warning control option
Change --no-pgo-warn-missing to -pgo-warn-missing-function
and negate the default. /NFC

Add more test to make sure the warning is off by default

llvm-svn: 278314
2016-08-11 05:09:30 +00:00
Dominic Chen 4173fffa08 [WebAssembly] Cleanup trailing whitespace
Summary: Test for commit access.

Subscribers: jfb, dschuff

Differential Revision: https://reviews.llvm.org/D23392

llvm-svn: 278313
2016-08-11 04:10:56 +00:00
Easwaran Raman 0d58fcac99 Make more fields of InlineParams Optional.
Differential revision: https://reviews.llvm.org/D23386

llvm-svn: 278312
2016-08-11 03:58:05 +00:00
Sanjoy Das 25fb5bda0f [Statepoints] Minor cosmetic change; NFC
The verification failure message was missing a space.

llvm-svn: 278309
2016-08-11 00:56:46 +00:00
Chris Bieneman ca5de9d9e3 [MachOYAML] Don't output empty ExportTrie
The YAML representation was always outputting the root node of an export trie even if the trie was empty. While this doesn't really have any functional impact, it does add visual clutter to the yaml file.

llvm-svn: 278307
2016-08-11 00:20:03 +00:00
Tim Northover 357f1be2ca GlobalISel: support same ConstantExprs as Instructions.
It's more than just inttoptr, but the others can't be tested until we have
support for non-trivial constants (they currently get unavoidably folded to a
ConstantInt).

llvm-svn: 278303
2016-08-10 23:02:41 +00:00
Tim Shen 113cfa0772 [ADT] Move LLVM_ATTRIBUTE_UNUSED_RESULT to the function, otherwise gcc 4.8 complains about it.
It's a fix for the original patch r278251.

llvm-svn: 278298
2016-08-10 22:35:38 +00:00
Tim Northover 2ff5935a95 GlobalISel: add tests forgotten in r278293.
llvm-svn: 278296
2016-08-10 22:13:48 +00:00
Sanjoy Das 63752e6372 [LangRef] Fix formatting (no semantic change)
llvm-svn: 278294
2016-08-10 21:48:24 +00:00
Tim Northover 406024a108 GlobalISel: implement simple function calls on AArch64.
We're still limited in the arguments we support, but this at least handles the
basic cases.

llvm-svn: 278293
2016-08-10 21:44:01 +00:00
Changpeng Fang fb9c3818dd AMDGPU/SI: Implement amdgcn image intrinsics with sampler
Summary:
  This patch define and implement amdgcn image intrinsics with sampler.

    1. define vdata type to be llvm_anyfloat_ty, address type to be llvm_anyfloat_ty,
       and rsrc type to be llvm_anyint_ty. As a result, we expect the intrinsics name
       to have three suffixes to overload each of these three types;

    2. D128 as well as two other flags are implied in the three types, for example,
       if you use v8i32 as resource type, then r128 is 0!

    3. don't expose TFE flag, and other flags are exposed in the instruction order:
       unrm, glc, slc, lwe and da.

Differential Revision: http://reviews.llvm.org/D22838

Reviewed by:
  arsenm and tstellarAMD

llvm-svn: 278291
2016-08-10 21:15:30 +00:00
Piotr Padlewski d89875ca39 Changed sign of LastCallToStaticBouns
Summary:
I think it is much better this way.
When I firstly saw line:
  Cost += InlineConstants::LastCallToStaticBonus;
I though that this is a bug, because everywhere where the cost is being reduced
it is usuing -=.

Reviewers: eraman, tejohnson, mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23222

llvm-svn: 278290
2016-08-10 21:15:22 +00:00
Kyle Butt 81d32846b0 Codegen: Don't tail-duplicate blocks with un-analyzable fallthrough.
If AnalyzeBranch can't analyze a block and it is possible to
fallthrough, then duplicating the block doesn't make sense, as only one
block can be the layout predecessor for the un-analyzable fallthrough.

Submitted wit a test case, but NOTE: the test case doesn't currently
fail. However, the test case fails with D20505 and would have saved me
some time debugging.

llvm-svn: 278288
2016-08-10 21:03:27 +00:00
Kyle Butt e1c931b171 CodeGen: If Convert blocks that would form a diamond when tail-merged.
The following function currently relies on tail-merging for if
conversion to succeed. The common tail of cond_true and cond_false is
extracted, and this then forms a diamond pattern that can be
successfully if converted.

If this block does not get extracted, either because tail-merging is
disabled or the threshold is higher, we should still recognize this
pattern and if-convert it.

Fixed a regression in the original commit. Need to un-reverse branches after
reversing them, or other conversions go awry.

define i32 @t2(i32 %a, i32 %b) nounwind {
entry:
        %tmp1434 = icmp eq i32 %a, %b           ; <i1> [#uses=1]
        br i1 %tmp1434, label %bb17, label %bb.outer

bb.outer:               ; preds = %cond_false, %entry
        %b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ]
        %a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ]
        br label %bb

bb:             ; preds = %cond_true, %bb.outer
        %indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ]
        %tmp. = sub i32 0, %b_addr.021.0.ph
        %tmp.40 = mul i32 %indvar, %tmp.
        %a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph
        %tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph
        br i1 %tmp3, label %cond_true, label %cond_false

cond_true:              ; preds = %bb
        %tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph
        %tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph
        %indvar.next = add i32 %indvar, 1
        br i1 %tmp1437, label %bb17, label %bb

cond_false:             ; preds = %bb
        %tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0
        %tmp14 = icmp eq i32 %a_addr.026.0, %tmp10
        br i1 %tmp14, label %bb17, label %bb.outer

bb17:           ; preds = %cond_false, %cond_true, %entry
        %a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ]
        ret i32 %a_addr.026.1
}

Without tail-merging or diamond-tail if conversion:
LBB1_1:                                 @ %bb
                                        @ =>This Inner Loop Header: Depth=1
        cmp     r0, r1
        ble     LBB1_3
@ BB#2:                                 @ %cond_true
                                        @   in Loop: Header=BB1_1 Depth=1
        subs    r0, r0, r1
        cmp     r1, r0
        it      ne
        cmpne   r0, r1
        bgt     LBB1_4
LBB1_3:                                 @ %cond_false
                                        @   in Loop: Header=BB1_1 Depth=1
        subs    r1, r1, r0
        cmp     r1, r0
        bne     LBB1_1
LBB1_4:                                 @ %bb17
        bx      lr

With diamond-tail if conversion, but without tail-merging:
@ BB#0:                                 @ %entry
        cmp     r0, r1
        it      eq
        bxeq    lr
LBB1_1:                                 @ %bb
                                        @ =>This Inner Loop Header: Depth=1
        cmp     r0, r1
        ite     le
        suble   r1, r1, r0
        subgt   r0, r0, r1
        cmp     r1, r0
        bne     LBB1_1
@ BB#2:                                 @ %bb17
        bx      lr

llvm-svn: 278287
2016-08-10 20:45:56 +00:00
Reid Kleckner 7cbd6b74b4 Disable sancov tests failing due to apparent endianness issues
Undoes some of the effect of r278271

llvm-svn: 278285
2016-08-10 20:11:35 +00:00
Reid Kleckner 0881472ac4 [sancov] Port sancov -print-coverage-pcs to COFF
The export table is not considered part of the object file symbol table,
so we have to look through it separately.

Reviewers: kcc

Differential Revision: https://reviews.llvm.org/D23321

llvm-svn: 278284
2016-08-10 20:08:19 +00:00
Jonathan Roelofs 851b79dc4d Fix UB in APInt::ashr
i64 -1, whose sign bit is the 0th one, can't be left shifted without invoking UB.

https://reviews.llvm.org/D23362

llvm-svn: 278280
2016-08-10 19:50:14 +00:00
Matt Arsenault 61f8ba8b79 AMDGPU: s_setpc_b64 should be an indirect branch
llvm-svn: 278278
2016-08-10 19:20:02 +00:00
Matt Arsenault c6b1350039 AMDGPU: Set sizes on control flow pseudos
llvm-svn: 278276
2016-08-10 19:11:51 +00:00
Matt Arsenault f4af802381 AMDGPU: Remove empty file comment
llvm-svn: 278275
2016-08-10 19:11:48 +00:00
Matt Arsenault 11587d97be AMDGPU: Remove unnecessary cast
llvm-svn: 278274
2016-08-10 19:11:45 +00:00
Matt Arsenault 57431c9680 AMDGPU: Change insertion point of si_mask_branch
Insert before the skip branch if one is created.
This is a somewhat more natural placement relative
to the skip branches, and makes it possible to implement
analyzeBranch for skip blocks.

The test changes are mostly due to a quirk where
the block label is not emitted if there is a terminator
that is not also a branch.

llvm-svn: 278273
2016-08-10 19:11:42 +00:00
Matt Arsenault b920e9987d AMDGPU: Use CreateStackObject instead of CreateSpillStackObject
I'm not sure what the difference is, but no other target
uses this for emergency spill slots.

llvm-svn: 278272
2016-08-10 19:11:36 +00:00
Reid Kleckner 260ac88cd4 [sancov] Run more sancov tests on non-x86-Linux machines
Add the $arch-registered-target features that clang uses to disable
tests that require a registered backend, so that we can run the sancov
tests on Windows. LLVM's lit suite did not appear to have a per-test way
to do this, and I would rather not split up the sancov tests into
architecture directories.

Split out of https://reviews.llvm.org/D23321

llvm-svn: 278271
2016-08-10 19:03:18 +00:00
Sanjay Patel 5ccc85fe83 [x86, AVX] allow FP vector select folding to bitwise logic ops (PR28895)
This handles the case in:
https://llvm.org/bugs/show_bug.cgi?id=28895

...but we are not getting all of the possibilities yet. 
Eg, we use 'X86::FANDN' for scalar FP select combines.

That enhancement is filed as:
https://llvm.org/bugs/show_bug.cgi?id=28925

Differential Revision: https://reviews.llvm.org/D23337

llvm-svn: 278270
2016-08-10 19:00:11 +00:00
Andrew Kaylor 498d3113c3 [IndVarSimplify] Eliminate zext of a signed IV when the IV is known to be non-negative
Patch by Li Huang

Differential Revision: https://reviews.llvm.org/D18867

llvm-svn: 278269
2016-08-10 18:56:35 +00:00
Nicolai Haehnle 02d784172c LiveIntervalAnalysis: fix a crash in repairOldRegInRange
Summary:
See the new test case for one that was (non-deterministically) crashing
on trunk and deterministically hit the assertion that I added in D23302.
Basically, the machine function contains a sequence

     DS_WRITE_B32 %vreg4, %vreg14:sub0, ...
     DS_WRITE_B32 %vreg4, %vreg14:sub0, ...
     %vreg14:sub1<def> = COPY %vreg14:sub0

and SILoadStoreOptimizer::mergeWrite2Pair merges the two DS_WRITE_B32
instructions into one before calling repairIntervalsInRange.

Now repairIntervalsInRange wants to repair %vreg14, in particular, and
ends up trying to repair %vreg14:sub1 as well, but that only becomes
active _after_ the range that is to be repaired, hence the crash due
to LR.find(...) == LR.begin() at the start of repairOldRegInRange.

I believe that just skipping those subrange is fine, but again, not too
familiar with that code.

Reviewers: MatzeB, kparzysz, tstellarAMD

Subscribers: llvm-commits, MatzeB

Differential Revision: https://reviews.llvm.org/D23303

llvm-svn: 278268
2016-08-10 18:51:14 +00:00
Andrew Kaylor b10f6876cd [ValueTracking] An improvement to IR ValueTracking on Non-negative Integers
Patch by Li Huang

Differential Revision: https://reviews.llvm.org/D18777

llvm-svn: 278267
2016-08-10 18:47:19 +00:00
Krzysztof Parzyszek c9c2bba621 [Hexagon] Remove unused variants of LO/HI instructions
llvm-svn: 278266
2016-08-10 18:40:36 +00:00
Kyle Butt 71b1ca1be4 Codegen: Tail Merge: Be less aggressive with special cases.
This change makes it possible for tail-duplication and tail-merging to
be disjoint. By being less aggressive when merging during layout, there are no
overlapping cases between tail-duplication and tail-merging, provided the
thresholds are disjoint.

There is a remaining TODO to benchmark the succ_size() test for non-layout tail
merging.

llvm-svn: 278265
2016-08-10 18:36:18 +00:00
Simon Pilgrim 675c257a32 [X86][SSE] Dropped blend(insertps(x,y),zero) combine - this is now handled by target shuffle chain combining
llvm-svn: 278260
2016-08-10 18:10:29 +00:00
Tim Shen ca37f0f990 [ADT] Removed synthesized constructor introduced in r278251, since MSVC doesn't support them
llvm-svn: 278259
2016-08-10 18:08:38 +00:00
Matthias Braun c881d61314 TargetOpcodes: Rewrite the documentation for SUBREG_TO_REG
Differential Revision: https://reviews.llvm.org/D22708

llvm-svn: 278258
2016-08-10 18:05:50 +00:00
Krzysztof Parzyszek 0bbad0fc86 [Hexagon] Simplify the SplitConst32/64 pass
llvm-svn: 278256
2016-08-10 18:05:47 +00:00
Krzysztof Parzyszek 3b946c90ef [Hexagon] Add extra patterns for single-precision min/max instructions
llvm-svn: 278252
2016-08-10 17:56:24 +00:00
Tim Shen 64afe23528 [ADT] Add make_scope_exit().
Summary: make_scope_exit() is described in C++ proposal p0052r2, which uses RAII to do cleanup works at scope exit.

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22796

llvm-svn: 278251
2016-08-10 17:52:09 +00:00
Rong Xu 63f970ee24 Fix LCSSA increased compile time
We are seeing r276077 drastically increasing compiler time for our larger
benchmarks in PGO profile generation build (both clang based and IR based
mode) -- it can be 20x slower than without the patch (like from 30 secs to
780 secs)

The increased time are all in pass LCSSA. The problematic code is about
PostProcessPHIs after use-rewrite. Note that the InsertedPhis from ssa_updater
is accumulating (never been cleared). Since the inserted PHIs are added to the
candidate for each rewrite, The earlier ones will be repeatedly added. Later
when adding the new PHIs to the work-list, we don't check the duplication
either. This can result in extremely long work-list that containing tons of
duplicated PHIs.

This patch fixes the issue by hoisting the code out of the loop.

Differential Revision: http://reviews.llvm.org/D23344

llvm-svn: 278250
2016-08-10 17:49:11 +00:00