Commit Graph

114246 Commits

Author SHA1 Message Date
Simon Pilgrim 2a9cde026c [X86][AVX] Reduce v4f64/v4i64 shuffle costs (PR37882)
These were being over cautious for costs for one/two op general shuffles - VSHUFPD doesn't have to replicate the same shuffle in both lanes like VSHUFPS does. 

llvm-svn: 335216
2018-06-21 11:37:13 +00:00
Mikael Holmen 42f7bc96dd [DebugInfo] Make sure all DBG_VALUEs' reguse operands have IsDebug property
Summary:
In some cases, these operands lacked the IsDebug property, which is meant to signal that
they should not affect codegen. This patch adds a check for this property in the
MachineVerifier and adds it where it was missing.

This includes refactorings to use MachineInstrBuilder construction functions instead of
manually setting up the intrinsic everywhere.

Patch by: JesperAntonsson

Reviewers: aprantl, rnk, echristo, javed.absar

Reviewed By: aprantl

Subscribers: qcolombet, sdardis, nemanjai, JDevlieghere, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D48319

llvm-svn: 335214
2018-06-21 10:03:34 +00:00
David Green a465188500 [DAGCombine] Fix alignment for offset loads/stores
The alignment parameter to getExtLoad is treated as a base alignment,
not the alignment of the load (base + offset). When we infer a better
alignment for a Ptr we need to ensure that it applies to the base to
prevent the alignment on the load from being wrong.

This fixes a bug where the alignment could then be used to incorrectly
prove noalias between a load and a store, leading to a miscompile.

Differential Revision: https://reviews.llvm.org/D48029

llvm-svn: 335210
2018-06-21 08:30:07 +00:00
Eric Christopher 12e221d9b9 Remove FIXME comment about WIP.
This is the only line other than the function signature remaining
of the original patch.

llvm-svn: 335208
2018-06-21 07:15:19 +00:00
Eric Christopher 76cfef46f3 Add some explanatory text to the associated symbol support.
llvm-svn: 335207
2018-06-21 07:15:14 +00:00
Florian Hahn d36aa1f763 Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions.
r335150 should resolve the issues with the clang-with-thin-lto-ubuntu
and clang-with-lto-ubuntu builders.

Original message:
This patch updates IPSCCP to use PredicateInfo to propagate
facts to true branches predicated by EQ and to false branches
predicated by NE.

As a follow up, we should be able to extend it to also propagate additional
facts about nonnull.

Reviewers: davide, mssimpso, dberlin, efriedma

Reviewed By: davide, dberlin

llvm-svn: 335206
2018-06-21 07:15:08 +00:00
Mikael Holmen 57b33f6aac [DebugInfo] Keep DBG_VALUE undef in LiveDebugVariables
Summary:
Fixes PR36579.

For cases where we had e.g.

 DBG_VALUE 42
 [...]
 DBG_VALUE undef

LiveDebugVariables would discard all undef DBG_VALUEs and then it would
look like the variable had the value 42 throughout the rest of the
function, which is incorrect.

With this patch we don't remove all undef DBG_VALUEs in LiveDebugVariables
so they will be kept after register allocation just like other DBG_VALUEs
which will yield more correct debug information.

Reviewers: aprantl

Reviewed By: aprantl

Subscribers: bjope, Ka-Ka, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D48277

llvm-svn: 335205
2018-06-21 07:02:46 +00:00
Chandler Carruth d1dab0c3c0 [PM/LoopUnswitch] Add partial non-trivial unswitching for invariant
conditions feeding a chain of `and`s or `or`s for a branch.

Much like with full non-trivial unswitching, we rely on the pass manager
to handle iterating until all of the profitable unswitches have been
done. This is to allow other more profitable unswitches to fire on any
of the cloned, simpler versions of the loop if viable.

Threading the partial unswiching through the non-trivial unswitching
logic motivated some minor refactorings. If those are too disruptive to
make it reasonable to review this patch, I can separate them out, but
it'll be somewhat timeconsuming so I wanted to send it for initial
review as-is. Feel free to tell me whether it warrants pulling apart.

I've tried to re-use (and factor out) logic form the partial trivial
unswitching, but not as much could be shared as I had haped. Still, this
wasn't as bad as I naively expected.

Some basic testing is added, but I probably need more. Suggestions for
things you'd like to see tested more than welcome. One thing I'd like to
do is add some testing that when we schedule this with loop-instsimplify
it effectively cleans up the cruft created.

Last but not least, this uncovered a bug that has been in loop cloning
the entire time for non-trivial unswitching. Specifically, we didn't
correctly add the outer-most cloned loop to the list of cloned loops.
This meant that LCSSA wouldn't be updated for it hypothetically, and
more significantly that we would never visit it in the loop pass
manager. I noticed this while checking loop-instsimplify by hand. I'll
try to separate this bugfix out into its own patch with a more focused
test. But it is just one line, so shouldn't significantly confuse the
review here.

After this patch, the only missing "feature" in this unswitch I'm aware
of us non-trivial unswitching of switches. I'll try implementing *full*
non-trivial unswitching of switches (which is at least a sound thing to
implement), but *partial* non-trivial unswitching of switches is
something I don't see any sound and principled way to implement. I also
have no interesting test cases for the latter, so I'm not really
worried. The rest of the things that need to be ported are bug-fixes and
more narrow / targeted support for specific issues.

Differential Revision: https://reviews.llvm.org/D47522

llvm-svn: 335203
2018-06-21 06:14:03 +00:00
Michael Zolotukhin 336d75cc73 ProvenanceAnalysis: Store WeakTrackingVH instead of Value* in UnderlyingValue Cache.
Summary:
Since the value stored in the cache might be deleted or replaced with
something else, we need to use tracking ValueHandlers instead of plain
Value pointers. It was discovered in one of internal builds, and
unfortunately there is no small reproducer for the issue.

The cache was introduced in rL327328.

Reviewers: ahatanak, pete

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D48407

llvm-svn: 335201
2018-06-21 05:14:00 +00:00
Craig Topper 296526bf46 [X86] Remove masking from 512-bit floating max/min intrinsics. Use select instruction instead.
llvm-svn: 335199
2018-06-21 05:00:56 +00:00
Tim Shen 433b9761ce Revert "[SCEV] Improve zext(A /u B) and zext(A % B)"
This reverts commit r335197, as some bots are not happy.

llvm-svn: 335198
2018-06-21 02:15:32 +00:00
Tim Shen 5af61e0a28 [SCEV] Improve zext(A /u B) and zext(A % B)
Summary:
Try to match udiv and urem patterns, and sink zext down to the leaves.

I'm not entirely sure why some unrelated tests change, but the added <nsw>s seem right.

Reviewers: sanjoy

Subscribers: jlebar, hiraditya, bixia, llvm-commits

Differential Revision: https://reviews.llvm.org/D48338

llvm-svn: 335197
2018-06-21 01:49:07 +00:00
Wolfgang Pieb 61d8c8d9b3 [DWARF] Improved error reporting for range lists.
Errors found processing the DW_AT_ranges attribute are propagated by lower level 
routines and reported by their callers.

Reviewer: JDevlieghere

Differential Revision: https://reviews.llvm.org/D48344

llvm-svn: 335188
2018-06-20 22:56:37 +00:00
Simon Dardis 0f111dd704 [mips] Add microMIPS specific addressing patterns.
These are identical but use microMIPS instructions instead of MIPS instructions.

Also, flatten the 'let AdditionalPredicates = [InMicroMips]' by using the
ISA_MICROMIPS adjective. Add tests for constant materialization.

Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D48275

llvm-svn: 335185
2018-06-20 22:40:12 +00:00
Alina Sbirlea dfd14adeb0 Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred.
Summary:
Two utils methods have essentially the same functionality. This is an attempt to merge them into one.
1. lib/Transforms/Utils/Local.cpp : MergeBasicBlockIntoOnlyPred
2. lib/Transforms/Utils/BasicBlockUtils.cpp : MergeBlockIntoPredecessor

Prior to the patch:
1. MergeBasicBlockIntoOnlyPred
Updates either DomTree or DeferredDominance
Moves all instructions from Pred to BB, deletes Pred
Asserts BB has single predecessor
If address was taken, replace the block address with constant 1 (?)

2. MergeBlockIntoPredecessor
Updates DomTree, LoopInfo and MemoryDependenceResults
Moves all instruction from BB to Pred, deletes BB
Returns if doesn't have a single predecessor
Returns if BB's address was taken

After the patch:
Method 2. MergeBlockIntoPredecessor is attempting to become the new default:
Updates DomTree or DeferredDominance, and LoopInfo and MemoryDependenceResults
Moves all instruction from BB to Pred, deletes BB
Returns if doesn't have a single predecessor
Returns if BB's address was taken

Uses of MergeBasicBlockIntoOnlyPred that need to be replaced:

1. lib/Transforms/Scalar/LoopSimplifyCFG.cpp
Updated in this patch. No challenges.

2. lib/CodeGen/CodeGenPrepare.cpp
Updated in this patch.
  i. eliminateFallThrough is straightforward, but I added using a temporary array to avoid the iterator invalidation.
  ii. eliminateMostlyEmptyBlock(s) methods also now use a temporary array for blocks
Some interesting aspects:
  - Since Pred is not deleted (BB is), the entry block does not need updating.
  - The entry block was being updated with the deleted block in eliminateMostlyEmptyBlock. Added assert to make obvious that BB=SinglePred.
  - isMergingEmptyBlockProfitable assumes BB is the one to be deleted.
  - eliminateMostlyEmptyBlock(BB) does not delete BB on one path, it deletes its unique predecessor instead.
  - adding some test owner as subscribers for the interesting tests modified:
    test/CodeGen/X86/avx-cmp.ll
    test/CodeGen/AMDGPU/nested-loop-conditions.ll
    test/CodeGen/AMDGPU/si-annotate-cf.ll
    test/CodeGen/X86/hoist-spill.ll
    test/CodeGen/X86/2006-11-17-IllegalMove.ll

3. lib/Transforms/Scalar/JumpThreading.cpp
Not covered in this patch. It is the only use case using the DeferredDominance.
I would defer to Brian Rzycki to make this replacement.

Reviewers: chandlerc, spatel, davide, brzycki, bkramer, javed.absar

Subscribers: qcolombet, sanjoy, nemanjai, nhaehnle, jlebar, tpr, kbarton, RKSimon, wmi, arsenm, llvm-commits

Differential Revision: https://reviews.llvm.org/D48202

llvm-svn: 335183
2018-06-20 22:01:04 +00:00
Alina Sbirlea 201d02c75c [MemorySSA] Verify Phi incoming blocks are block predecessors.
Summary: Make the MemorySSA verify also check that all Phi incoming blocks are block predecessors.

Reviewers: george.burgess.iv

Subscribers: sanjoy, jlebar, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D48333

llvm-svn: 335174
2018-06-20 21:06:13 +00:00
Craig Topper c2696d577b [X86] Use setcc ISD opcode for AVX512 integer comparisons all the way to isel
I don't believe there is any real reason to have separate X86 specific opcodes for vector compares. Setcc has the same behavior just uses a different encoding for the condition code.

I had to change the CondCodeAction for SETLT and SETLE to prevent some transforms from changing SETGT lowering.

Differential Revision: https://reviews.llvm.org/D43608

llvm-svn: 335173
2018-06-20 21:05:02 +00:00
Simon Pilgrim 3d1c8c97b8 [SLPVectorizer] Provide InstructionsState down the BoUpSLP vectorization call tree
As described in D48359, this patch pushes InstructionsState down the BoUpSLP call hierarchy instead of the corresponding raw OpValue. This makes it easier to track the alternate opcode etc. and avoids us having to call getAltOpcode which makes it difficult to support more than one alternate opcode.

Differential Revision: https://reviews.llvm.org/D48382

llvm-svn: 335170
2018-06-20 20:54:52 +00:00
Stanislav Mekhanoshin 20279dc025 Allow binop C1, (select cc, CF, CT) -> select folding
Previously this folding was done only if select is a first operand.
However, for non-commutative operations constant may go before
select.

Differential Revision: https://reviews.llvm.org/D48223

llvm-svn: 335167
2018-06-20 20:24:20 +00:00
Simon Dardis 6021424c10 [mips] Correct predicates for loads, bit manipulation instructions and some pseudos
Additionally, correct the definition of the rdhwr instruction.

Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D48216

llvm-svn: 335162
2018-06-20 19:59:58 +00:00
Matt Arsenault 5a4ec8127f AMDGPU: Fix scalar_to_vector for v4i16/v4f16
llvm-svn: 335161
2018-06-20 19:45:48 +00:00
Matt Arsenault 3d06668ad4 AMDGPU: Fix missing C++ mode comment
llvm-svn: 335160
2018-06-20 19:45:40 +00:00
Sanjay Patel 3597588493 [IR] add/use isIntDivRem convenience function
There are more existing potential users of this,
but I've limited this patch to the first couple
that I found to minimize typo risk.

llvm-svn: 335157
2018-06-20 19:02:17 +00:00
Chandler Carruth 4da3331d3d [PM/LoopUnswitch] Support partial trivial unswitching.
The idea of partial unswitching is to take a *part* of a branch's
condition that is loop invariant and just unswitching that part. This
primarily makes sense with i1 conditions of branches as opposed to
switches. When dealing with i1 conditions, we can easily extract loop
invariant inputs to a a branch and unswitch them to test them entirely
outside the loop.

As part of this, we now create much more significant cruft in the loop
body, so this relies on adding cleanup passes to the loop pipeline and
revisiting unswitched loops to do that cleanup before continuing to
process them.

This already appears to be more powerful at unswitching than the old
loop unswitch pass, and so I'd appreciate pretty careful review in case
I'm just missing some correctness checks. The `LIV-loop-condition` test
case is not unswitched by the old unswitch pass, but is with this pass.

Thanks to Sanjoy and Fedor for the review!

Differential Revision: https://reviews.llvm.org/D46706

llvm-svn: 335156
2018-06-20 18:57:07 +00:00
Alex Bradbury fafdebcfcb [RISCV] Accept fmv.s.x and fmv.x.s as mnemonic aliases for fmv.w.x and fmv.x.w
These instructions were renamed in version 2.2 of the user-level ISA spec, but 
the old name should also be accepted by standard tools.

llvm-svn: 335154
2018-06-20 18:42:25 +00:00
Vedant Kumar 4e93f3dcf8 [Local] Generalize insertReplacementDbgValues, NFC
This utility should operate on Values, not Instructions. While I'm here,
I've also made it possible to skip emitting replacement dbg.values for
certain debug users (by having RewriteExpr return nullptr).

llvm-svn: 335152
2018-06-20 18:40:14 +00:00
Florian Hahn 5ac2629823 [PredicateInfo] Order instructions in different BBs by DFSNumIn.
Using OrderedInstructions::dominates as comparator for instructions in
BBs without dominance relation can cause a non-deterministic order
between such instructions. That in turn can cause us to materialize
copies in a non-deterministic order. While this does not effect
correctness, it causes some minor non-determinism in the final generated
code, because values have slightly different labels.

Without this patch, running -print-predicateinfo on a reasonably large
module produces slightly different output on each run.

This patch uses the dominator trees DFSInNum to order instruction from
different BBs, which should enforce a deterministic ordering and
guarantee that dominated instructions come after the instructions that
dominate them.

Reviewers: dberlin, efriedma, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D48230

llvm-svn: 335150
2018-06-20 17:42:01 +00:00
Paul Robinson 8e3e374e5f [DWARF] Don't keep a ref to possibly stack allocated data.
llvm-svn: 335146
2018-06-20 17:08:46 +00:00
Vlad Tsyrklevich 34b9112216 IRMover: Account for matching types present across modules
Summary:
Due to uniqueing of DICompositeTypes, it's possible for a type from one
module to be loaded into another earlier module without being renamed.
Then when the defining module is being IRMoved, the type can be used as
a Mapping destination before being loaded, such that when it's requested
using TypeMapTy::get() it will fail with an assertion that the type is a
source type when it's actually a type in both the source and
destination modules. Correctly handle that case by allowing a non-opaque
non-literal struct type be present in both modules.

Fix for PR37684.

Reviewers: pcc, tejohnson

Reviewed By: pcc, tejohnson

Subscribers: tobiasvk, mehdi_amini, steven_wu, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D47898

llvm-svn: 335145
2018-06-20 16:50:56 +00:00
Vedant Kumar 6fa24b0b7f [Local] Add a utility to insert replacement dbg.values, NFC
The purpose of this utility is to make it easier for optimizations to
insert replacement dbg.values for instructions they are deleting. This
is useful in situations where salvageDebugInfo is inapplicable, say,
because the new dbg.value cannot refer to an operand of the dying value.

The utility is called insertReplacementDbgValues.

It assumes that the instruction 'From' is going to be deleted, and
inserts replacement dbg.values for each debug user of 'From'. The
newly-inserted dbg.values refer to 'To' instead of 'From'. Each
replacement dbg.value has the same location and variable as the debug
user it replaces, has a DIExpression determined by the result of
'RewriteExpr' applied to an old debug user of 'From', and is placed
before 'InsertBefore'.

This should simplify future patches, like D48331.

llvm-svn: 335144
2018-06-20 16:50:25 +00:00
Paul Robinson 37fbb92652 Remove a redundant initialization. NFC
llvm-svn: 335143
2018-06-20 16:12:03 +00:00
Simon Pilgrim 292651a5b7 [SLPVectorizer] Move isOneOf after InstructionsState type. NFCI.
A future patch will have isOneOf use InstructionsState.

llvm-svn: 335142
2018-06-20 16:11:00 +00:00
Bjorn Pettersson 7bf676662a [DAG] Don't map a TableId to itself in the ReplacedValues map
Summary:
Found some regressions (infinite loop in DAGTypeLegalizer::RemapId)
after r334880. This patch makes sure that we do map a TableId to
itself.

Reviewers: niravd

Reviewed By: niravd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48364

llvm-svn: 335141
2018-06-20 16:06:09 +00:00
Nirav Dave cd558887d3 [DAG] Fix and-mask folding when narrowing loads.
Summary:
Check that and masks are strictly smaller than implicit mask from
narrowed load.

Fixes PR37820.

Reviewers: samparker, RKSimon, nemanjai

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D48335

llvm-svn: 335137
2018-06-20 15:36:29 +00:00
Sam Clegg 52564675c4 [WebAssembly] Update know failures for the wasm waterfall
Summary:
The waterfall no longer builds .s files and no longers uses
the wasm-o when it builds object files.

Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D48371

llvm-svn: 335135
2018-06-20 15:17:12 +00:00
Simon Pilgrim 0c9f8dcde7 [SLPVectorizer] Use InstructionsState to record AltOpcode
This is part of a move towards generalizing the alternate opcode mechanism and not just supporting (F)Add/(F)Sub counterparts.

The patch embeds the AltOpcode in the InstructionsState instead of calling getAltOpcode so often.

I'm hoping to eventually remove all uses of getAltOpcode and handle alternate opcode selection entirely within getSameOpcode, that will require us to use InstructionsState throughout the BoUpSLP call hierarchy (similar to some of the changes in D28907), which I will begin in future patches.

Differential Revision: https://reviews.llvm.org/D48359

llvm-svn: 335134
2018-06-20 15:13:40 +00:00
Simon Pilgrim 2e2f20a949 [SLPVectorizer] Relax "alternate" opcode vectorisation to work with any SK_Select shuffle pattern
D47985 saw the old SK_Alternate 'alternating' shuffle mask replaced with the SK_Select mask which accepts either input operand for each lane, equivalent to a vector select with a constant condition operand.

This patch updates SLPVectorizer to make full use of this SK_Select shuffle pattern by removing the 'isOdd()' limitation.

The AArch64 regression will be fixed by D48172.

Differential Revision: https://reviews.llvm.org/D48174

llvm-svn: 335130
2018-06-20 14:26:28 +00:00
Sanjay Patel 0c57de4c21 [InstSimplify] Fix missed optimization in simplifyUnsignedRangeCheck()
For both operands are unsigned, the following optimizations are valid, and missing:

   1. X > Y && X != 0 --> X > Y
   2. X > Y || X != 0 --> X != 0
   3. X <= Y || X != 0 --> true
   4. X <= Y || X == 0 --> X <= Y
   5. X > Y && X == 0 --> false

unsigned foo(unsigned x, unsigned y) { return x > y && x != 0; }
should fold to x > y, but I found we haven't done it right now.
besides, unsigned foo(unsigned x, unsigned y) { return x < y && y != 0; }
Has been folded to x < y, so there may be a bug.

Patch by: Li Jia He!

Differential Revision: https://reviews.llvm.org/D47922

llvm-svn: 335129
2018-06-20 14:22:49 +00:00
Alex Bradbury 79d2b50ca8 [RISCV] Add InstAlias definitions for fgt.{s|d}, fge.{s|d}
These are produced by GCC and supported by GAS, but not currently contained in 
the pseudoinstruction listing in the RISC-V ISA manual.

llvm-svn: 335127
2018-06-20 14:03:02 +00:00
Krzysztof Parzyszek d8b780dcd6 [Hexagon] Remove 'T' from HasVNN predicates, NFC
Patch by Sumanth Gundapaneni.

llvm-svn: 335124
2018-06-20 13:56:09 +00:00
Simon Dardis eae99120b0 [mips] Fix the predicates of some DSP instructions from AdditionalPredicates to ASEPredicate
Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D48166

llvm-svn: 335122
2018-06-20 13:29:57 +00:00
Sanjay Patel 825a4faa8d [InstCombine] ignore debuginfo when removing redundant assumes (PR37726)
This is similar to:
rL335083

Fixes::
https://bugs.llvm.org/show_bug.cgi?id=37726

llvm-svn: 335121
2018-06-20 13:22:26 +00:00
Alex Bradbury 18b9bd7d6c [RISCV] Add InstAlias definitions for sgt and sgtu
These are produced by GCC and supported by GAS, but not currently contained in 
the pseudoinstruction listing in the RISC-V ISA manual.

llvm-svn: 335120
2018-06-20 12:54:02 +00:00
Tim Northover 644a819534 ARM: convert ORR instructions to ADD where possible on Thumb.
Thumb has more 16-bit encoding space dedicated to ADD than ORR, allowing both a
3-address encoding and a wider range of immediates. So, particularly when
optimizing for code size (but it doesn't make things worse elsewhere) it's
beneficial to select an OR operation to an ADD if we know overflow won't occur.

This is made even better by LLVM's penchant for putting operations in canonical
form by converting the other way.

llvm-svn: 335119
2018-06-20 12:09:44 +00:00
Tim Northover 70666e7765 [AArch64] Implement FLT_ROUNDS macro.
Very similar to ARM implementation, just maps to an MRS.

Should fix PR25191.

Patch by Michael Brase.

llvm-svn: 335118
2018-06-20 12:09:01 +00:00
Andrea Di Biagio 2145b13fc9 [llvm-mca][X86] Teach how to identify register writes that implicitly clear the upper portion of a super-register.
This patch teaches llvm-mca how to identify register writes that implicitly zero
the upper portion of a super-register.

On X86-64, a general purpose register is implemented in hardware as a 64-bit
register. Quoting the Intel 64 Software Developer's Manual: "an update to the
lower 32 bits of a 64 bit integer register is architecturally defined to zero
extend the upper 32 bits".  Also, a write to an XMM register performed by an AVX
instruction implicitly zeroes the upper 128 bits of the aliasing YMM register.

This patch adds a new method named clearsSuperRegisters to the MCInstrAnalysis
interface to help identify instructions that implicitly clear the upper portion
of a super-register.  The rest of the patch teaches llvm-mca how to use that new
method to obtain the information, and update the register dependencies
accordingly.

I compared the kernels from tests clear-super-register-1.s and
clear-super-register-2.s against the output from perf on btver2.  Previously
there was a large discrepancy between the estimated IPC and the measured IPC.
Now the differences are mostly in the noise.

Differential Revision: https://reviews.llvm.org/D48225

llvm-svn: 335113
2018-06-20 10:08:11 +00:00
Simon Pilgrim b7ac037797 [SLPVectorizer] Split Tree/Reduction cost calls to simplify debugging. NFCI.
llvm-svn: 335110
2018-06-20 09:39:01 +00:00
Roman Lebedev 42a1ff11fb [NFC][SCEV] Add tests related to bit masking (PR37793)
Summary:
Related to https://bugs.llvm.org/show_bug.cgi?id=37793, https://reviews.llvm.org/D46760#1127287

We'd like to do this canonicalization https://rise4fun.com/Alive/Gmc
But it is currently restricted by rL155136 / rL155362, which says:
```
    // This is a constant shift of a constant shift. Be careful about hiding
    // shl instructions behind bit masks. They are used to represent multiplies
    // by a constant, and it is important that simple arithmetic expressions
    // are still recognizable by scalar evolution.
    //
    // The transforms applied to shl are very similar to the transforms applied
    // to mul by constant. We can be more aggressive about optimizing right
    // shifts.
    //
    // Combinations of right and left shifts will still be optimized in
    // DAGCombine where scalar evolution no longer applies.
```

I think these tests show that for *constants*, SCEV has no issues with that canonicalization.

Reviewers: mkazantsev, spatel, efriedma, sanjoy

Reviewed By: mkazantsev

Subscribers: sanjoy, javed.absar, llvm-commits, stoklund, bixia

Differential Revision: https://reviews.llvm.org/D48229

llvm-svn: 335101
2018-06-20 07:54:11 +00:00
Roman Lebedev d23b6831de [X86][Znver1] Specify Register Files, RCU; FP scheduler capacity.
Summary:
First off: i do not have any access to that processor,
so this is purely theoretical, no benchmarks.

I have been looking into b**d**ver2 scheduling profile, and while cross-referencing
the existing b**t**ver2, znver1 profiles, and the reference docs
(`Software Optimization Guide for AMD Family {15,16,17}h Processors`),
i have noticed that only b**t**ver2 scheduling profile specifies these.

Also, there is no mca test coverage.

Reviewers: RKSimon, craig.topper, courbet, GGanesh, andreadb

Reviewed By: GGanesh

Subscribers: gbedwell, vprasad, ddibyend, shivaram, Ashutosh, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D47676

llvm-svn: 335099
2018-06-20 07:01:14 +00:00
Clement Courbet 7b9913fb9f [X86] Add sched class WriteLAHFSAHF and fix values.
Summary:
I ran llvm-exegesis on SKX, SKL, BDW, HSW, SNB.
Atom is from Agner and SLM is a guess.
I've left AMD processors alone.

Reviewers: RKSimon, craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48079

llvm-svn: 335097
2018-06-20 06:13:39 +00:00