Commit Graph

127869 Commits

Author SHA1 Message Date
Nemanja Ivanovic d58b976bb7 Fix for PR26690
I mistook BitVector::empty() to mean BitVector::count() == 0 and it does
not. Corrected the issue with the fix for PR26500.

llvm-svn: 261525
2016-02-22 14:47:49 +00:00
Benjamin Kramer 451f54cf62 Fix some abuse of auto flagged by clang's -Wrange-loop-analysis.
llvm-svn: 261524
2016-02-22 13:11:58 +00:00
Igor Breger 252c2d9680 AVX512F: Add assembler Intel syntax tests for knl, fix minor bugs.
Differential Revision: http://reviews.llvm.org/D17498

llvm-svn: 261521
2016-02-22 12:37:41 +00:00
Igor Breger 4511e76e5c AVX512: Fix scalar mem operands.
Differential Revision: http://reviews.llvm.org/D17500

llvm-svn: 261520
2016-02-22 11:48:27 +00:00
Elena Demikhovsky 9914dbd11b Allow setting MaxRerollIterations above 16
By Ayal Zaks.

Differential Revision http://reviews.llvm.org/D17258

llvm-svn: 261517
2016-02-22 09:38:28 +00:00
Craig Topper 84f2f18cfd [X86] Minor formatting fix. NFC
llvm-svn: 261515
2016-02-22 08:00:04 +00:00
Tobias Grosser 946ca0a946 Use EXPECT_EQ in the unittests instead of plain assert
This addresses post-review comments from Duncan P. N. Exon Smith to r261485.

llvm-svn: 261514
2016-02-22 07:20:40 +00:00
Duncan P. N. Exon Smith e59c8af705 Reapply "CodeGen: Use references in MachineTraceMetrics::Trace, NFC"
This reverts commit r261510, effectively reapplying r261509.  The
original commit missed a caller in AArch64ConditionalCompares.

Original commit message:

Pass non-null arguments by reference in MachineTraceMetrics::Trace,
simplifying future work to remove implicit iterator => pointer
conversions.

llvm-svn: 261511
2016-02-22 03:33:28 +00:00
Duncan P. N. Exon Smith 0cc90a9147 Revert "CodeGen: Use references in MachineTraceMetrics::Trace, NFC"
This reverts commit r261509.  I'm not sure how this compiled locally,
but something was out of whack.

llvm-svn: 261510
2016-02-22 03:12:42 +00:00
Duncan P. N. Exon Smith 83d3476fd2 CodeGen: Use references in MachineTraceMetrics::Trace, NFC
Pass non-null arguments by reference in MachineTraceMetrics::Trace,
simplifying future work to remove implicit iterator => pointer
conversions.

llvm-svn: 261509
2016-02-22 03:07:49 +00:00
Duncan P. N. Exon Smith 395bd9cd63 CodeGen: Explicitly convert from iterator to pointer, NFC
llvm-svn: 261508
2016-02-22 02:53:42 +00:00
Duncan P. N. Exon Smith d6de2a7612 Document assumption in X86FrameLowering::inlineStackProbe()
Resolve FIXME from r261504.  Apparently bundled instructions are illegal
here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160215/334146.html

llvm-svn: 261507
2016-02-22 02:32:35 +00:00
Kevin B. Smith 47e64abe1a [X86] More test updates to support fixup-byte-word-insts optimization
either on or off.
Differential Revisions: http://reviews.llvm.org/D17458

llvm-svn: 261505
2016-02-22 01:27:56 +00:00
Duncan P. N. Exon Smith dc0848c029 CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC
Delete MachineInstr::getIterator(), since the term "iterator" is
overloaded when talking about MachineInstr.

- Downcast to ilist_node in iplist::getNextNode() and getPrevNode() so
  that ilist_node::getIterator() is still available.
- Add it back as MachineInstr::getInstrIterator().  This matches the
  naming in MachineBasicBlock.
- Add MachineInstr::getBundleIterator().  This is explicitly called
  "bundle" (not matching MachineBasicBlock) to disintinguish it clearly
  from ilist_node::getIterator().
- Update all calls.  Some of these I switched to `auto` to remove
  boiler-plate, since the new name is clear about the type.

There was one call I updated that looked fishy, but it wasn't clear what
the right answer was.  This was in X86FrameLowering::inlineStackProbe(),
added in r252578 in lib/Target/X86/X86FrameLowering.cpp.  I opted to
leave the behaviour unchanged, but I'll reply to the original commit on
the list in a moment.

llvm-svn: 261504
2016-02-21 22:58:35 +00:00
Lang Hames e1fd99c197 [Orc] Add stack-realignment code to the i386 resolver function.
The resolver uses the fxsave/fxrstor instructions, which require 16-byte
alignment, to save SSE state to the stack. Since 16-byte alignment can't be
assumed on all OSes (and all i386 OSes share this function) - add code to
automatically bump the alignment to 16-bytes on entry to the function.

llvm-svn: 261503
2016-02-21 22:50:26 +00:00
Duncan P. N. Exon Smith f65e407c6e CodeGen: Split bundle_iterator into a separate file, NFC
Split MachineBasicBlock::bundle_iterator into a separate file, and
rename the class to MachineBundleIterator.

This is a precursor to adding a `MachineInstr::getBundleIterator()`
accessor, which will eventually let us delete the final call to
getNodePtrUnchecked(), and then remove the UB from ilist_iterator.

As a drive-by, I removed an unnecessary second template parameter.

llvm-svn: 261502
2016-02-21 22:05:50 +00:00
Duncan P. N. Exon Smith e5407d1ad8 CodeGen: Add constructor for MIBuilder from a bundle_iterator, NFC
Don't require explicit conversions for creating a MachineInstrBuilder
from a bundle_iterator.

llvm-svn: 261500
2016-02-21 21:15:37 +00:00
Duncan P. N. Exon Smith c10f961a6a ADT: Disallow == and != between pointers and ilist iterators
I completely missed these non-class operators when I removed the
implicit conversions in r252380.  Remove them now.  r261498 should have
already removed all uses.

Note (repeated from r252380): if you have out-of-tree code, it should be
fairly easy to revert this patch downstream while you update your
out-of-tree call sites.  Note that these conversions are occasionally
latent bugs (that may happen to "work" now, but only because of getting
lucky with UB; follow-ups will change your luck).  When they are valid,
I suggest using `->getIterator()` to go from pointer to iterator, and
`&*` to go from iterator to pointer.

llvm-svn: 261499
2016-02-21 20:46:37 +00:00
Duncan P. N. Exon Smith e9bc579c37 ADT: Remove == and != comparisons between ilist iterators and pointers
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.

Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators.  (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)

There should be NFC here.

llvm-svn: 261498
2016-02-21 20:39:50 +00:00
Duncan P. N. Exon Smith ec6f7fed54 TransformUtils: Avoid getNodePtrUnchecked() in integer division, NFC
Stop relying on `getNodePtrUnchecked()` being useful on invalid
iterators.  This function is documented to be for internal use only, and
the pointer type will eventually have to change to remove UB from
ilist_iterator.  Instead, check the iterator before it has been
invalidated.

llvm-svn: 261497
2016-02-21 20:14:29 +00:00
Duncan P. N. Exon Smith a848c47130 ADT: Stop using getNodePtrUnchecked on end() iterators
Stop using `getNodePtrUnchecked()` when building IR.  Eventually a
dereference will be required to get at the downcast node, since the
iterator will only store an `ilist_node_base` of some sort.

This should have no functionality change for now, but is a path towards
removing some more UB from ilist.

llvm-svn: 261495
2016-02-21 19:52:15 +00:00
Craig Topper 4ab89ea187 [X86] Remove unused encoding types from disassembler. NFC
llvm-svn: 261494
2016-02-21 19:49:16 +00:00
Duncan P. N. Exon Smith 7b269642d2 CodeGen: Avoid getNodePtrUnchecked() where we need a Value, NFC
`ilist_iterator<NodeTy>::getNodePtrUnchecked()` is documented as being
for internal use only, but CodeGenPrepare was using it anyway.  This
code relies on pulling out the `Value*` pointer even after the lifetime
of the iterator is over.  But having this pointer available in
ilist_iterator depends on UB in the first place.

Instead, safely pull out the `Value*` when the iterator is alive and
stop using the internal-only API.

There should be no functionality change here.

llvm-svn: 261493
2016-02-21 19:37:45 +00:00
Duncan P. N. Exon Smith 1b936c0f59 ADT: clang-format ilist_iterator, NFC
Also removed a couple of noisy (no-value-added) comments.

llvm-svn: 261492
2016-02-21 19:26:08 +00:00
Duncan P. N. Exon Smith 44b6e7077b ADT: Remove ilist_iterator random access API, NFC
Remove explicitly deleted random access API from ilist_iterator.
Since it no longer has implicit conversions to a pointer type, we
no longer need this protection.

llvm-svn: 261491
2016-02-21 19:23:18 +00:00
Simon Pilgrim e9093adae0 [X86][AVX] Add shuffle masking support for EltsFromConsecutiveLoads
Add support for the case where we have a consecutive load (which must include the first + last elements) with a mixture of undef/zero elements. We load the vector and then apply a shuffle to clear the zero'd elements.

Differential Revision: http://reviews.llvm.org/D17297

llvm-svn: 261490
2016-02-21 19:15:48 +00:00
Tobias Grosser 934fcf4dc6 ScalerEvolution: Only erase temporary values if they actually have been added
This addresses post-review comments from Sanjoy Das for r261485.

llvm-svn: 261486
2016-02-21 18:50:09 +00:00
Tobias Grosser 11332e5ec5 ScalarEvolution: Do not keep temporary PHI values in ValueExprMap
Before this patch simplified SCEV expressions for PHI nodes were only returned
the very first time getSCEV() was called, but later calls to getSCEV always
returned the non-simplified value, which had "temporarily" been stored in the
ValueExprMap, but was never removed and consequently blocked the caching of the
simplified PHI expression.

llvm-svn: 261485
2016-02-21 17:42:10 +00:00
Sanjay Patel 2440130437 fix inaccurate comment; NFC
llvm-svn: 261484
2016-02-21 17:33:31 +00:00
Sanjay Patel 368ac5dbf7 [InstCombine] add getNegativeIsTrueBoolVec() helper function; NFC
Originally part of:
http://reviews.llvm.org/D17485

We need this when simplifying masked memory ops too.

llvm-svn: 261483
2016-02-21 17:29:33 +00:00
Sanjoy Das aa63dc0e9a Fix LLVM's handling and detection of skylake and cannonlake CPUs
Summary:
 - Rename `"skylake"` == SkylakeServerProc to `"skylake-avx512"`
 - Change `"skylake"` to denote SkylakeClientProc
 - Fix the detection of cpu family 6 and model 94 to be
   SkylakeClientProc instead of SkylakeServerProc
 - Remove the `"cnl"` for CannonLake

Reviewers: craig.topper, delena

Subscribers: zansari, echristo, qcolombet, RKSimon, spatel, DavidKreitzer, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17090

llvm-svn: 261482
2016-02-21 17:12:03 +00:00
Sanjoy Das 979a11d5b2 [LoopDeletion] Add an assert that verifies LCSSA
This is inspired by PR24804 -- had this assert been there before,
isolating the root cause for PR24804 would have been far easier.

llvm-svn: 261481
2016-02-21 17:11:59 +00:00
JF Bastien 55afdfd384 WebAssembly: update expected torture test failures
r261457 handles CopyToReg nodes with flag results in LowerCopyToReg, which was causing the SelectionDAGNodes assert.

llvm-svn: 261479
2016-02-21 16:52:00 +00:00
Simon Pilgrim b1cc4d6d69 [InstCombine] Added SSE41 roundss/roundsd demanded vector elements invec tests
llvm-svn: 261472
2016-02-21 14:50:27 +00:00
Simon Pilgrim e2ada8d3a7 [InstCombine] Added XOP frczss/vfrczsd demanded vector elements tests
llvm-svn: 261469
2016-02-21 12:45:36 +00:00
Simon Pilgrim 2ed8686a28 [InstCombine] Added SSE41 roundss/roundsd demanded vector elements tests
llvm-svn: 261468
2016-02-21 12:40:39 +00:00
Dan Gohman 27a11eefcc [WebAssembly] Support physical registers in the rewrite-to-discard optimization.
llvm-svn: 261465
2016-02-21 03:27:22 +00:00
Duncan P. N. Exon Smith b6452798a5 IR: Add ConstantData, for operand-less Constants
Add a common parent `ConstantData` to the constants that have no
operands.  These are guaranteed to represent abstract data that is in no
way tied to a specific Module.

This is a good cleanup on its own.  It also makes it simpler to disallow
RAUW (and factor away use-lists) on these constants in the future.  (I
have some experimental patches that make RAUW illegal on ConstantData,
and they seem to catch a bunch of bugs...)

llvm-svn: 261464
2016-02-21 02:39:49 +00:00
David Majnemer 78f46bea08 Unbreak non-X86 targets from fallout caused by r261462
llvm-svn: 261463
2016-02-21 01:40:04 +00:00
David Majnemer a3ea407d48 [X86] Use the correct alignment for COMDAT constant pool entries
COFF doesn't have sections with mergeable contents.  Instead, each
constant pool entry ends up in a COMDAT section.  The linker, when
choosing between COMDAT sections, doesn't choose the max alignment of
the two sections.  You just get whatever alignment was on the section.

If one constant needed a higher alignment in one object file from
another one, then we will get into trouble if the linker chooses the
lower alignment one.

Instead, lets promote the alignment of the constant pool entry to make
sure we don't use an under aligned constant with an instruction which
assumed otherwise.

This fixes PR26680.

llvm-svn: 261462
2016-02-21 01:30:30 +00:00
Simon Pilgrim 471efd244a [InstCombine] SSE/SSE2 (u)comiss/(u)comisd comparison intrinsics only use the lowest vector element
llvm-svn: 261460
2016-02-20 23:17:35 +00:00
Dan Gohman 3a71ccb607 [WebAssembly] Refine a README.txt entry.
The register coloring pass may also need to be involved in order to
optimally sort registers.

llvm-svn: 261458
2016-02-20 23:11:14 +00:00
Dan Gohman 02c0871abd [WebAssembly] Handle CopyToReg nodes with flag results in LowerCopyToReg.
llvm-svn: 261457
2016-02-20 23:09:44 +00:00
Simon Pilgrim 6a0dca535a [InstCombine] Added SSE/SSE2 comparison intrinsics demanded vector elements tests
llvm-svn: 261454
2016-02-20 22:41:31 +00:00
Derek Schuff 90dbb8cfc3 [WebAssembly] Write stack pointer back to memory when FP is used
The stack pointer is bumped when there is a frame pointer or when there
are static-size objects, but was only getting written back when there
were static-size objects.

llvm-svn: 261453
2016-02-20 22:18:47 +00:00
Derek Schuff dc5f6aa4bb [WebAssembly] Stackify function prologs and epilogs
The instructions are the same, but fewer locals are used.

Differential Revision: http://reviews.llvm.org/D17428

llvm-svn: 261452
2016-02-20 21:46:50 +00:00
Simon Pilgrim d4768fa314 [InstCombine] Added some SSE/SSE2 demanded vector elements tests
llvm-svn: 261451
2016-02-20 21:44:48 +00:00
Dan Gohman d1c5a3aa21 Don't scan for SSA register operands to update when not in SSA form.
TailDuplicate can run on either on SSA code or non-SSA code, as indicated to
it by MRI->isSSA() ("PreRegAlloc" here). TailDuplicate does extra work to
preserve SSA invariants when it duplicates code. This patch makes it skip
some of this extra work in the case where the code is not in SSA form.

llvm-svn: 261450
2016-02-20 21:28:18 +00:00
Nemanja Ivanovic daf0ca2341 Fix the build bot break caused by rL261441.
The patch has a necessary call to a function inside an assert. Which is fine
when you have asserts turned on. Not so much when they're off. Sorry about
the regression.

llvm-svn: 261447
2016-02-20 20:45:37 +00:00
Simon Pilgrim d765c0b8b9 [X86][AVX] Added test case for PR22359
llvm-svn: 261444
2016-02-20 19:21:20 +00:00
Nemanja Ivanovic ae22101c55 Fix for PR 26500
This patch corresponds to review:
http://reviews.llvm.org/D17294

It ensures that whatever block we are emitting the prologue/epilogue into, we
have the necessary scratch registers. It takes away the hard-coded register
numbers for use as scratch registers as registers that are guaranteed to be
available in the function prologue/epilogue are not guaranteed to be available
within the function body. Since we shrink-wrap, the prologue/epilogue may end
up in the function body.

llvm-svn: 261441
2016-02-20 18:16:25 +00:00
Simon Pilgrim 79a14dd3d1 [X86] Regenerated pr16360.ll
llvm-svn: 261440
2016-02-20 17:56:45 +00:00
Simon Pilgrim 972d9fb76b [X86][SSE41] More fast-isel intrinsics tests
llvm-svn: 261439
2016-02-20 17:30:37 +00:00
Simon Pilgrim 19b3ce0f07 [X86][SSE41] Added fast-isel intrinsics tests
As discussed on PR24580, this patch adds some (more to come) initial fast-isel codegen tests to match the IR generated in clang/test/CodeGen/sse41-builtins.c

llvm-svn: 261438
2016-02-20 17:11:32 +00:00
Simon Pilgrim c5199aae82 [DAGCombiner] Use getBitcast helper when possible. NFCI.
llvm-svn: 261437
2016-02-20 15:05:29 +00:00
Simon Pilgrim ecb0433599 [X86][SSE] Fixed issue with commutation of 'faux unary' target shuffles (PR26667)
Fixed a bug introduced by D16683 when a binary shuffle is simplified to a unary shuffle (with undef/zero sentinel mask indices) - if this resulted in only the second input being used combineX86ShuffleChain failed to take this into account and still referenced the first input.

llvm-svn: 261434
2016-02-20 14:39:45 +00:00
Simon Pilgrim ccf2cce67c [X86][SSE] Move all undef/zero cases before target shuffle combining.
First small step towards fixing PR26667 - we need to ensure that combineX86ShuffleChain only gets called with a valid shuffle input node (a similar issue was found in D17041).

llvm-svn: 261433
2016-02-20 12:57:32 +00:00
Joerg Sonnenberger 36894dcfed When MemoryDependenceAnalysis hits a CFG with many transparent blocks,
the algorithm easily degrades into quadratic memory and time complexity.
The easiest example is a long chain of BBs that don't otherwise use a
location. The caching will add an entry for every intermediate block and
limiting the number of results doesn't help as no results are produced
until a definition is found.

Introduce a limit similar to the existing instructions-per-block limit.
This limit counts the total number of blocks checked. If the limit is
reached, entries are considered unknown. The initial value is 1000,
which avoids regressions for normal sized functions while still
limiting edge cases to reasnable memory consumption and execution time.

Differential Revision: http://reviews.llvm.org/D16123

llvm-svn: 261430
2016-02-20 11:24:44 +00:00
Andrey Turetskiy 9994b8894a [X86] Enable the LEA optimization pass by default.
Differential Revision: http://reviews.llvm.org/D16877

llvm-svn: 261429
2016-02-20 11:11:55 +00:00
Andrey Turetskiy 0babd26626 [X86] PR26575: Fix LEA optimization pass (Part 2).
Handle address displacement operands of a type other than Immediate or Global in LEAs and load/stores.

Ref: https://llvm.org/bugs/show_bug.cgi?id=26575

Differential Revision: http://reviews.llvm.org/D17374

llvm-svn: 261428
2016-02-20 10:58:28 +00:00
Benjamin Kramer 7d537ae747 [SimplifyCFG] Use pointer identity to simplify predicate.
No functional change intended.

llvm-svn: 261427
2016-02-20 10:40:42 +00:00
Benjamin Kramer 2337c1fe13 [LVI] Move ConstantRanges instead of copying.
No functional change intended. Copying small (<= 64 bits) APInts isn't
expensive but bloats code by generating the slow path everywhere. Moving
doesn't care about the size of the value.

llvm-svn: 261426
2016-02-20 10:40:34 +00:00
David Majnemer 862c5ba302 Move some code from doInitialization to runOnFunction
This has no observable behavior change, it just makes the state
insertion pass look a little more like normal passes.

llvm-svn: 261420
2016-02-20 07:34:21 +00:00
Craig Topper f5ef3f9ce6 [X86] Remove some unused encoding checks from the disassembler table building.
llvm-svn: 261418
2016-02-20 06:20:21 +00:00
Craig Topper 2bf0c0394d [X86] Add some missing reversed forms of XOP instructions.
llvm-svn: 261417
2016-02-20 06:20:17 +00:00
Chandler Carruth c1dc384b54 [PM/AA] Wire up TBAA to the new pass manager's registry and test it.
llvm-svn: 261411
2016-02-20 04:04:52 +00:00
Chandler Carruth d6091a0344 [PM/AA] Wire up the scoped-no-alias AA to the new pass manager's
registry and test it.

llvm-svn: 261410
2016-02-20 04:03:06 +00:00
Chandler Carruth 2b3d0446f4 [PM/AA] Wire up SCEVAA to the new pass manager's registry and test it.
llvm-svn: 261409
2016-02-20 04:01:45 +00:00
Matthias Braun c65e904be8 MachineCopyPropagation: Introduce Reg2MIMap typedef; NFC
llvm-svn: 261408
2016-02-20 03:56:41 +00:00
Matthias Braun bd18d751de MachineCopyPropagation: Move variables from function to pass
This avoids unnecessarily passing them around when calling helper
functions. It may also be slightly faster to call clear() on the
datastructures instead of freshly initializing them for each block.

llvm-svn: 261407
2016-02-20 03:56:39 +00:00
Matthias Braun 273575dcbe MachineCopyPropagation: Use ranged for, cleanup; NFC
llvm-svn: 261406
2016-02-20 03:56:36 +00:00
Matthias Braun 57b5f11aa7 MachineCopyPropagation: Use assert() instead of if{report_error()} for 'impossible' condition
llvm-svn: 261405
2016-02-20 03:56:33 +00:00
Chandler Carruth 342c671b66 [PM/AA] Wire up CFLAA to the new pass manager fully, and port one of its
tests over to exercise this code.

This uncovered a few missing bits here and there in the analysis, but
nothing interesting.

llvm-svn: 261404
2016-02-20 03:52:02 +00:00
Chandler Carruth 4f846a5f15 [PM/AA] Port alias analysis evaluator to the new pass manager, and use
it to actually test the new pass manager AA wiring.

This patch was extracted from the (somewhat too large) D12357 and
rebosed on top of the slightly different design of the new pass manager
AA wiring that I just landed. With this we can start testing the AA in
a thorough way with the new pass manager.

Some minor cleanups to the code in the pass was necessitated here, but
otherwise it is a very minimal change.

Differential Revision: http://reviews.llvm.org/D17372

llvm-svn: 261403
2016-02-20 03:46:03 +00:00
Mike Aizatsky 82657cf01a fixing msvc warning.
llvm-svn: 261396
2016-02-20 02:11:49 +00:00
Sanjoy Das a809db7c29 [SCEV] Don't spell `SCEV *` variables as `Scev`; NFC
I missed a spot in rL261393.

llvm-svn: 261395
2016-02-20 01:59:15 +00:00
Sanjoy Das 807d33da96 [SCEV] Don't spell `SCEV *` variables as `Scev`; NFC
It reads odd since most other places name a `SCEV *` as `S`.  Pure
renaming change.

llvm-svn: 261393
2016-02-20 01:44:10 +00:00
Sanjoy Das c42f7cc3f8 [SCEV] Don't use std::make_pair; NFC
`{A, B}` reads cleaner than `std::make_pair(A, B)`.

llvm-svn: 261392
2016-02-20 01:35:56 +00:00
David Majnemer 1efa23ddab [SimplifyCFG] Merge together cleanuppads
Cleanuppads may be merged together if one is the only predecessor of the
other in which case a simple transform can be performed: replace the
a cleanupret with a branch and remove an unnecessary cleanuppad.

Differential Revision: http://reviews.llvm.org/D17459

llvm-svn: 261390
2016-02-20 01:07:45 +00:00
Davide Italiano 228978c0dc [X86ISelLowering] Fix TLSADDR lowering when shrink-wrapping is enabled.
TLSADDR nodes are lowered into actuall calls inside MC. In order to prevent
shrink-wrapping from pushing prologue/epilogue past them (which result
in TLS variables being accessed before the stack frame is set up), we 
put markers, so that the stack gets adjusted properly.
Thanks to Quentin Colombet for guidance/help on how to fix this problem!

llvm-svn: 261387
2016-02-20 00:44:47 +00:00
Tom Stellard 467b5b9024 AMDGPU/SI: Use v_readfirstlane to legalize SMRD with VGPR base pointer
Summary:
Instead of trying to replace SMRD instructions with a VGPR base pointer
with an equivalent MUBUF instruction, we now copy the base pointer to
SGPRs using v_readfirstlane.

This is safe to do, because any load selected as an SMRD instruction
has been proven to have a uniform base pointer, so each thread in the
wave will have the same pointer value in VGPRs.

This will fix some errors on VI from trying to replace SMRD instructions
with addr64-enabled MUBUF instructions that don't exist.

Reviewers: arsenm, cfang, nhaehnle

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17305

llvm-svn: 261385
2016-02-20 00:37:25 +00:00
Quentin Colombet e611698e84 [RegAllocFast] Properly track the physical register definitions on calls.
PR26485

llvm-svn: 261384
2016-02-20 00:32:29 +00:00
Reid Kleckner 344078f51f [codeview] Fix emission of file changes in inline line tables
These are supposed to be file checksum table offsets, not file ids.

llvm-svn: 261379
2016-02-19 23:55:38 +00:00
Mike Aizatsky b28eef39f7 [sancov] sanitizer html report cosmetic improvements.
llvm-svn: 261375
2016-02-19 22:55:08 +00:00
Davide Italiano a8f1f2efaf [X86ISelLowering] Provide a more informative assert message.
I stumbled upon this while debugging a lowering bug.

llvm-svn: 261371
2016-02-19 22:18:49 +00:00
Davide Italiano 4cfe2a9e38 [X86ISelLowering] Merge two conditions inside a single if.
llvm-svn: 261370
2016-02-19 22:01:07 +00:00
Hans Wennborg a0f7090563 Revert r255691 "[LoopVectorizer] Refine loop vectorizer's register usage calculator by ignoring specific instructions."
It caused PR26509.

llvm-svn: 261368
2016-02-19 21:40:12 +00:00
Hans Wennborg 7c3077ca52 Revert r253557 "Alternative to long nops for X86 CPUs, by Andrey Turetsky"
Turns out the new nop sequences aren't actually nops on x86_64 (PR26554).

llvm-svn: 261365
2016-02-19 21:26:31 +00:00
David Blaikie 852c02baf9 llvm-dwp: Improve performance (N^2 to amortized N) by using a MapVector instead of linear searches through a vector
Figured this would be a problem, but didn't want to jump the gun - large
inputs demonstrate it pretty easily (mostly for type units, but might as
well do the same for CUs too). A random sample 6m27s -> 27s change.

Also, by checking this up-front for CUs (rather than when building the
cu_index) we can probably provide better error messages (see FIXMEs),
hopefully providing the name of the CUs rather than just their
signature.

llvm-svn: 261364
2016-02-19 21:09:26 +00:00
Dimitry Andric db417b6d40 Fix incorrect selection of AVX512 sqrt when OptForSize is on
Summary:
When optimizing for size, sqrt calls can be incorrectly selected as
AVX512 VSQRT instructions.  This is because X86InstrAVX512.td has a
`Requires<[OptForSize]>` in its `avx512_sqrt_scalar` multiclass
definition.  Even if the target does not support AVX512, the class can
apparently still be chosen, leading to an incorrect selection of
`vsqrtss`.

In PR26625, this lead to an assertion: Reg >= X86::FP0 && Reg <=
X86::FP6 && "Expected FP register!", because the `vsqrtss` instruction
requires an XMM register, which is not available on i686 CPUs.

Reviewers: grosbach, resistor, joker.eph

Subscribers: spatel, emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D17414

llvm-svn: 261360
2016-02-19 20:14:11 +00:00
Sanjoy Das ffb7bd11f7 [StatepointLowering] Minor non-semantic cleanups
Use auto, bring file up to coding standards etc.

llvm-svn: 261358
2016-02-19 19:37:07 +00:00
Dan Gohman 87e368b7db [WebAssembly] Add another optimization idea to README.txt.
llvm-svn: 261354
2016-02-19 19:22:44 +00:00
Geoff Berry 7e4ba3dc02 [AArch64][ShrinkWrap] Fix bug in prolog clobbering live reg when shrink wrapping.
Summary: See bug https://llvm.org/bugs/show_bug.cgi?id=26642

Reviewers: qcolombet, t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17350

llvm-svn: 261349
2016-02-19 18:27:32 +00:00
Sanjoy Das f6fee29ceb [StatepointLowering] Update StatepointMaxSlotsRequired correctly
Now that we don't always add an element to AllocatedStackSlots if we
don't find a pre-existing unallocated stack slot, bumping
StatepointMaxSlotsRequired to `NumSlots + 1` is not correct.  Instead
bump the statistic near the push_back, to
Builder.FuncInfo.StatepointStackSlots.size().

llvm-svn: 261348
2016-02-19 18:15:56 +00:00
Sanjoy Das e8019df552 [StatepointLowering] Fix a mistake in rL261336
The check on MFI->getObjectSize() has to be on the FrameIndex, not on
the index of the FrameIndex in AllocatedStackSlots.  Weirdly, the tests
I added in rL261336 didn't catch this.

llvm-svn: 261347
2016-02-19 18:15:53 +00:00
Matthew Simpson 29c997c1a1 [LV] Vectorize first-order recurrences
This patch enables the vectorization of first-order recurrences. A first-order
recurrence is a non-reduction recurrence relation in which the value of the
recurrence in the current loop iteration equals a value defined in the previous
iteration. The load PRE of the GVN pass often creates these recurrences by
hoisting loads from within loops.

In this patch, we add a new recurrence kind for first-order phi nodes and
attempt to vectorize them if possible. Vectorization is performed by shuffling
the values for the current and previous iterations. The vectorization cost
estimate is updated to account for the added shuffle instruction.

Contributed-by: Matthew Simpson and Chad Rosier <mcrosier@codeaurora.org>
Differential Revision: http://reviews.llvm.org/D16197

llvm-svn: 261346
2016-02-19 17:56:08 +00:00
Sanjoy Das 171313c69a [StatepointLowering] Change AllocatedStackSlots to use SmallBitVector
NFCI.  They key motivation here is that I'd like to use
SmallBitVector::all() in a later change.  Also, using a bit vector here
seemed better in general.

The only interesting change here is that in the failure case of
allocateStackSlot, we no longer (the equivalent of) push_back(true) to
AllocatedStackSlots.  As far as I can tell, this is fine, since we'd
never re-use those slots in the same StatepointLoweringState instance.

Technically there was no need to change the operator[] type accesses to
set() and test(), but I thought it'd be nice to make it obvious that
we're using something other than a std::vector like thing.

llvm-svn: 261337
2016-02-19 17:15:26 +00:00
Sanjoy Das d2db73ba59 [StatepointLowering] Fix bug in allocateStackSlot
allocateStackSlot did not consider the size of the value to be spilled
before deciding to re-use a spill slot.  This was originally okay (since
originally we'd only ever spill pointers), but it became not okay when
we changed our scheme to directly spill vectors of pointers.

While this change fixes the bug pointed out, it has two performance
caveats:

 - It matches spill slot and spillee size exactly, while in theory we
   can spill, e.g., an 8 byte pointer into a 16 byte slot.  This is
   slightly complicated to fix since in the stackmaps section, we report
   the size of the spill slot as the size of the "indirect value"; and
   if they're no longer equivalent, we'll have to keep track of the
   (indirect) value size separately from the stack slot size.

 - It will "spuriously run out" of reusable slots, since we now have an
   second check in the search loop in addition to the availablity
   check (e.g. you had two free scalar slots, and you first ask for a
   vector slot followed by a scalar slot).  I'll fix this in a later
   commit.

llvm-svn: 261336
2016-02-19 17:15:22 +00:00
Sanjoy Das 7b2e91fb59 [StatepointLowering] Clean up allocateStackSlot
This removes the unusual loop structure in allocateStackSlot in favor of
something more straightforward.  I've also removed the cautionary
comment in the function, which I suspect is historical cruft now, and
confuses more than it enlightens.

llvm-svn: 261335
2016-02-19 17:15:17 +00:00
Kevin B. Smith 652128d48c [X86] Change fixup-bw-inst.ll to test output with this optimization on and off.
Differential Revision: http://reviews.llvm.org/D17415

llvm-svn: 261332
2016-02-19 16:20:48 +00:00