Commit Graph

1553 Commits

Author SHA1 Message Date
Simon Pilgrim 44a9a71d2a [TTI] Fix uses of SK_ExtractSubvector shuffle costs (PR39368)
Correct costings of SK_ExtractSubvector requires the SubTy argument to indicate the type/size of the extracted subvector.

Unlike the rest of the shuffle kinds this means that the main Ty argument represents the source vector type not the destination!

I've done my best to fix a number of vectorizer uses:

SLP - the reduction epilogue costs should be using a SK_PermuteSingleSrc shuffle as these all occur at the hardware vector width - we're not extracting (illegal) subvector types. This is causing the cost model diffs as SK_ExtractSubvector costs are poorly handled and tend to just return 1 at the moment.

LV - I'm not clear on what the SK_ExtractSubvector should represents for recurrences - I've used a <1 x ?> subvector extraction as that seems to match the VF delta.

Differential Revision: https://reviews.llvm.org/D53573

llvm-svn: 345617
2018-10-30 18:10:02 +00:00
Jonas Paulsson af8e036c29 [SystemZ] Improve isFoldableLoad() for Sub, SDiv and UDiv.
Sub, SDiv and UDiv are not commutative, so only the RHS operand can fold a
load. This patch adds a check for this.

Review: Ulrich Weigand
https://reviews.llvm.org/D53791

llvm-svn: 345596
2018-10-30 13:41:03 +00:00
Craig Topper faf423ca74 [X86] Add -LABEL to some FileCheck checks. NFC
llvm-svn: 345407
2018-10-26 17:21:19 +00:00
Jonas Paulsson b7caa809e1 [SystemZ] Improve getMemoryOpCost() to find foldable loads that are converted.
The SystemZ backend can do arithmetic of memory by loading and then extending
one of the operands. Similarly, a load + truncate can be folded into an
operand.

This patch improves the SystemZ TTI cost function to recognize this.

Review: Ulrich Weigand
https://reviews.llvm.org/D52692

llvm-svn: 345327
2018-10-25 22:28:25 +00:00
Jonas Paulsson 4645711a8d [SystemZ] Improve handling and cost estimates of vector integer div/rem
Enable the DAG optimization that converts vector div/rem with constants into
multiply+shifts sequences by expanding them early. This is needed since
ISD::SMUL_LOHI is 'Custom' lowered on SystemZ, and will therefore not be
available to BuildSDIV after legalization.

Better cost values for these instructions based on how they will be
implemented (a constant divisor is cheaper).

Review: Ulrich Weigand
https://reviews.llvm.org/D53196

llvm-svn: 345321
2018-10-25 21:47:22 +00:00
Simon Pilgrim 53e8e145e9 [CostModel][X86] Add realistic vXi64 uitofp vXf64 costs
Match codegen improvements from D53649/rL345256

llvm-svn: 345263
2018-10-25 13:06:20 +00:00
Simon Pilgrim 0573b8d8b6 [CostModel][X86] Add realistic i64 uitofp f64 scalar costs
llvm-svn: 345261
2018-10-25 12:42:10 +00:00
Simon Pilgrim ac84005841 [CostModel][X86] Add vXi8 vector division by constants costs.
ISD::MULHS/ISD::MULHU lowering of vXi8 types means we expand these in TargetLowering BuildSDIV/BuildUDIV.

llvm-svn: 345175
2018-10-24 18:44:12 +00:00
Simon Pilgrim 2cce074e8c [CostModel][X86] Enable non-uniform vector division by constants costs.
Non-uniform division/remainder handling was added back at D49248/D50765 - so share the 'mul+sub' costs that already exist for uniform cases.

llvm-svn: 345164
2018-10-24 17:30:29 +00:00
Simon Pilgrim f04a04c2b6 [TTI][X86] Treat SK_Transpose shuffles as SK_PermuteTwoSrc - there's no difference in lowering.
llvm-svn: 345048
2018-10-23 16:45:26 +00:00
Simon Pilgrim f1d8b7c49e [CostModel][X86] Add transpose shuffle cost tests
llvm-svn: 345045
2018-10-23 16:27:14 +00:00
Simon Pilgrim 9a2ee1ea2d Add BROADCAST shuffle cost tests.
Part of a lot of cleanup necessary before PR39368.

llvm-svn: 345025
2018-10-23 13:14:54 +00:00
Simon Pilgrim 051feee5c2 Add BROADCAST shuffle cost tests.
Part of a lot of cleanup necessary before PR39368.

llvm-svn: 345023
2018-10-23 13:00:22 +00:00
Simon Pilgrim 8c3d87b8cf [ARM] Regenerate reverse shuffle costs
Came about while cleaning up general shuffle costs for PR39368

llvm-svn: 344966
2018-10-22 22:26:00 +00:00
Simon Pilgrim acd10b9f6c [CostModel][X86] Add some initial extract/insert subvector shuffle cost tests
Just f64/i64 tests initially to demonstrate PR39368

llvm-svn: 344857
2018-10-20 17:38:33 +00:00
Simon Pilgrim e612ab0c80 [CostModel][X86] Add integer vector reduction cost tests
llvm-svn: 344846
2018-10-20 14:29:59 +00:00
Thomas Lively fa54e56d84 [ConstantFolding] Constant fold minimum and maximum intrinsics
Summary: Depends on D52764

Reviewers: aheejin, dschuff

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52765

llvm-svn: 344796
2018-10-19 18:15:32 +00:00
Anna Thomas 6f732bfb79 [LV] Teach vectorizer about variant value store into uniform address
Summary:
Teach vectorizer about vectorizing variant value stores to uniform
address. Similar to rL343028, we do not allow vectorization if we have
multiple stores to the same uniform address.

Cost model already has the change for considering the extract
instruction cost for a variant value store. See added test cases for how
vectorization is done.
The patch also contains changes to the ORE messages.

Reviewers: Ayal, mkuper, anemet, hsaito

Subscribers: rkruppe, llvm-commits

Differential Revision: https://reviews.llvm.org/D52656

llvm-svn: 344613
2018-10-16 15:46:26 +00:00
Max Kazantsev fdfd98ceec [SCEV] Limit AddRec "simplifications" to avoid combinatorial explosions
SCEV's transform that turns `{A1,+,A2,+,...,+,An}<L> * {B1,+,B2,+,...,+,Bn}<L>` into
a single AddRec of size `2n+1` with complex combinatorial coefficients can easily
trigger exponential growth of the SCEV (in case if nothing gets folded and simplified).
We tried to restrain this transform using the option `scalar-evolution-max-add-rec-size`,
but its default value seems to be insufficiently small: the test attached to this patch
with default value of this option `16` has a SCEV of >3M symbols (when printed out).

This patch reduces the simplification limit. It is not a cure to combinatorial
explosions, but at least it reduces this corner case to something more or less
reasonable.

Differential Revision: https://reviews.llvm.org/D53282
Reviewed By: sanjoy

llvm-svn: 344584
2018-10-16 05:26:21 +00:00
Jonas Paulsson bf66f38705 [SystemZ] Temporarily disable high VFs with integer div/rem.
Until mischeduler is clever enough to avoid spilling in a vectorized loop
with many (scalar) DLRs it is better to avoid high vectorization factors (8
and above).

llvm-svn: 344129
2018-10-10 09:30:29 +00:00
Jonas Paulsson 2c8b33770c [SystemZ] Take better care when computing needed vector registers in TTI.
A new function getNumVectorRegs() is better to use for the number of needed
vector registers instead of getNumberOfParts(). This is to make sure that the
number of vector registers (and typically operations) required for a vector
type is accurate.

getNumberOfParts() which was previously used works by splitting the vector
type until it is legal gives incorrect results for types with a non
power of two number of elements (rare).

A new static function getScalarSizeInBits() that also checks for a pointer
type and returns 64U for it since otherwise it gets a value of 0). Used in a
few places where Ty may be pointer.

Review: Ulrich Weigand
llvm-svn: 344115
2018-10-10 07:36:27 +00:00
George Burgess IV d98d505c0d [Analysis] Make LocationSize pretty-printing more descriptive
This is the third patch in a series intended to make
https://reviews.llvm.org/D44748 more easily reviewable. Please see that
patch for more context. The second being r344013.

The intent is to make the output of printing a LocationSize more
precise. The main motivation for this is that we plan to add a bit to
distinguish whether a given LocationSize is an upper-bound or is
precise; making that information available in pretty-printing is nice.

llvm-svn: 344108
2018-10-10 01:35:22 +00:00
Anna Thomas b1e3d45318 [LV][LAA] Vectorize loop invariant values stored into loop invariant address
Summary:
We are overly conservative in loop vectorizer with respect to stores to loop
invariant addresses.
More details in https://bugs.llvm.org/show_bug.cgi?id=38546
This is the first part of the fix where we start with vectorizing loop invariant
values to loop invariant addresses.

This also includes changes to ORE for stores to invariant address.

Reviewers: anemet, Ayal, mkuper, mssimpso

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50665

llvm-svn: 343028
2018-09-25 20:57:20 +00:00
Jonas Paulsson 77df2f2f38 [SystemZ] Adjust cost functions for subtargets that use LI + LOC instead of IPM
After recent improvements which makes better use of LOC instead of IPM, the
TTI cost functions also needs to be updated to reflect this.

This involves sext, zext and xor of i1.

The tests were updated so that for z13 the new costs are expected, while the
old costs are still checked for on zEC12.

Review: Ulrich Weigand
https://reviews.llvm.org/D51339

llvm-svn: 342207
2018-09-14 06:46:55 +00:00
Peter Collingbourne c7d281905b Prevent Constant Folding From Optimizing inrange GEP
This patch does the following things:

1. update SymbolicallyEvaluateGEP so that it bails out if it cannot preserve inrange arribute;
2. update llvm/test/Analysis/ConstantFolding/gep.ll to remove UB in it;
3. remove inaccurate comment above ConstantFoldInstOperandsImpl in llvm/lib/Analysis/ConstantFolding.cpp;
4. add a new regression test that makes sure that no optimizations change an inrange GEP in an unexpected way.

Patch by Zhaomo Yang!

Differential Revision: https://reviews.llvm.org/D51698

llvm-svn: 341888
2018-09-11 01:53:36 +00:00
Philip Reames 9f09161290 [AST] Add test coverage of memsets
Immediately after posting https://reviews.llvm.org/D51895, I noticed a small bug.  These tests would have caught that.

llvm-svn: 341880
2018-09-10 23:14:30 +00:00
Philip Reames 5660bd460b [AST] Visit memtransfer arguments in order
The only point to this change is the test diffs.  When I remove this code entirely (in favor of the recently added generic handling), I don't want there to be any confusion due to spurious test diffs.

As an aside, the fact out tests are AST construction order dependent is not great.  I thought about fixing that, but the reasonable schemes I might want (e.g. sort by name) need the test diffs anyways.

Philip

llvm-svn: 341841
2018-09-10 16:00:27 +00:00
Tim Northover 12c1f7675f InstCombine: move hasOneUse check to the top of foldICmpAddConstant
There were two combines not covered by the check before now, neither of which
actually differed from normal in the benefit analysis.

The most recent seems to be because it was just added at the top of the
function (naturally). The older is from way back in 2008 (r46687) when we just
didn't put those checks in so routinely, and has been diligently maintained
since.

llvm-svn: 341831
2018-09-10 14:26:44 +00:00
Matt Arsenault 72d27f5525 AMDGPU: Fix tests using old number for constant address space
llvm-svn: 341770
2018-09-10 02:54:25 +00:00
Philip Reames cb8b3278e5 [AST] Generalize argument specific aliasing
AliasSetTracker has special case handling for memset, memcpy and memmove which pre-existed argmemonly on functions and readonly and writeonly on arguments. This patch generalizes it using the AA infrastructure to any call correctly annotated.

The motivation here is to cut down on confusion, not performance per se. For most instructions, there is a direct mapping to alias set. However, this is not guaranteed by the interface and was not in fact true for these three intrinsics *and only these three intrinsics*. I kept getting myself confused about this invariant, so I figured it would be good to clearly distinguish between a instructions and alias sets. Calls happened to be an easy target.

The nice side effect is that custom implementations of memset/memcpy/memmove - including wrappers discovered by IPO - can now be optimized the same as builts by LICM.

Note: The actual removal of the memset/memtransfer specific handling will happen in a follow on NFC patch.  It was originally part of this one, but separate for ease of review and rebase.

Differential Revision: https://reviews.llvm.org/D50730

llvm-svn: 341713
2018-09-07 21:36:11 +00:00
Nicola Zaghen 9588ad9611 [InstCombine] Fold icmp ugt/ult (add nuw X, C2), C --> icmp ugt/ult X, (C - C2)
Support for sgt/slt was added in rL294898, this adds the same cases also for unsigned compares.

This is the Alive proof: https://rise4fun.com/Alive/nyY

Differential Revision: https://reviews.llvm.org/D50972

llvm-svn: 341353
2018-09-04 10:29:48 +00:00
Nicolai Haehnle 65c026259e Move test/Analysis/DivergenceAnalysis/AMDGPU/loads.ll
Should fix failures of buildbots that don't build the AMDGPU backend.

Change-Id: I01cb84b4b47803b10c5b21ea0353546239860a51
llvm-svn: 341079
2018-08-30 15:24:00 +00:00
Nicolai Haehnle 35617ed4cb [NFC] Rename the DivergenceAnalysis to LegacyDivergenceAnalysis
Summary:
This is patch 1 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433).

The purpose of this patch is to free up the name DivergenceAnalysis for the new generic
implementation. The generic implementation class will be shared by specialized
divergence analysis classes.

Patch by: Simon Moll

Reviewed By: nhaehnle

Subscribers: jvesely, jholewinski, arsenm, nhaehnle, mgorny, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D50434

Change-Id: Ie8146b11be2c50d5312f30e11c7a3036a15b48cb
llvm-svn: 341071
2018-08-30 14:21:36 +00:00
Sanjay Patel 7a05641fa8 [InstCombine] remove unnecessary shuffle undef folding
Add a test for constant folding to show that 
(shuffle undef, undef, mask)
should already be handled via instsimplify.

llvm-svn: 340926
2018-08-29 13:24:34 +00:00
Kit Barton 7c80f98b69 [PPC] Remove Darwin support from POWER backend.
This patch issues an error message if Darwin ABI is attempted with the PPC
backend. It also cleans up existing test cases, either converting the test to
use an alternative triple or removing the test if the coverage is no longer
needed.

Updated Tests
-------------
The majority of test cases were updated to use a different triple that does not
include the Darwin ABI. Many tests were also updated to use FileCheck, in place
of grep.

Deleted Tests
-------------
llvm/test/tools/dsymutil/PowerPC/sibling.test was originally added to test
specific functionality of dsymutil using an object file created with an old
version of llvm-gcc for a Powerbook G4. After a discussion with @JDevlieghere he
suggested removing the test.

llvm/test/CodeGen/PowerPC/combine_loads_from_build_pair.ll was converted from a
PPC test to a SystemZ test, as the behavior is also reproducible there.

All other tests that were deleted were specific to the darwin/ppc ABI and no
longer necessary.

Phabricator Review: https://reviews.llvm.org/D50988

llvm-svn: 340795
2018-08-28 01:18:29 +00:00
Craig Topper a72012c206 [X86] Correct the cost of (v4i32 (fptoui (v4f64))) under AVX512F.
Summary: This was inheriting the cost from the AVX table, but should be legal under AVX512.

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D51267

llvm-svn: 340708
2018-08-26 18:47:44 +00:00
John Brawn 980da83f84 [PhiValues] Use callback value handles to invalidate deleted values
The way that PhiValues is integrated with BasicAA it is possible for a pass
which uses BasicAA to pick up an instance of BasicAA that uses PhiValues without
intending to, and then delete values from a function in a way that causes
PhiValues to return dangling pointers to these deleted values. Fix this by
having a set of callback value handles to invalidate values when they're
deleted.

llvm-svn: 340613
2018-08-24 15:48:30 +00:00
Brian Homerding 3ecabd709f [FunctionAttrs] Infer WriteOnly Function Attribute
These changes expand the FunctionAttr logic in order to mark functions as
WriteOnly when appropriate. This is done through an additional bool variable
and extended logic.

Reviewers: hfinkel, jdoerfert

Differential Revision: https://reviews.llvm.org/D48387

llvm-svn: 340537
2018-08-23 15:05:22 +00:00
Philip Reames 6b6d2e0105 [AST] Add a test for attribute intersection
Already works, but I initially convinced myself it doesn't, so add a test which shows it does.  :)

llvm-svn: 340453
2018-08-22 21:10:56 +00:00
Scott Linder 72855e36c5 [AMDGPU] Consider loads from flat addrspace to be potentially divergent
In general we can't assume flat loads are uniform, and cases where we can prove
they are should be handled through infer-address-spaces.

Differential Revision: https://reviews.llvm.org/D50991

llvm-svn: 340343
2018-08-21 21:24:31 +00:00
Philip Reames c3c23e8cf2 [AST] Remove notion of volatile from alias sets [NFCI]
Volatility is not an aliasing property. We used to model volatile as if it had extremely conservative aliasing implications, but that hasn't been true for several years now. So, it doesn't make sense to be in AliasSet.

It also turns out the code is entirely a noop. Outside of the AST code to update it, there was only one user: load store promotion in LICM. L/S promotion doesn't need the check since it walks all the users of the address anyway. It already checks each load or store via !isUnordered which causes us to bail for volatile accesses. (Look at the lines immediately following the two remove asserts.)

There is the possibility of some small compile time impact here, but the only case which will get noticeably slower is a loop with a large number of loads and stores to the same address where only the last one we inspect is volatile. This is sufficiently rare it's not worth optimizing for..

llvm-svn: 340312
2018-08-21 17:59:11 +00:00
Sanjay Patel 3ce999fa41 [ConstantFolding] improve folding of binops with vector undef operand
A non-undef operand may still have undef constant elements, 
so we should always propagate the vector results per-lane.

llvm-svn: 340194
2018-08-20 18:19:02 +00:00
Sanjay Patel 7ff7bd9b3c [ConstantFolding] add tests for binops on vectors with undef elements; NFC
llvm-svn: 340190
2018-08-20 17:31:34 +00:00
Philip Reames 96bc076c3a [AST] Clarify printing of unknown size locations [NFC]
Printing "unknown" is much more clear than an arbitrary large integer

llvm-svn: 340108
2018-08-17 23:17:31 +00:00
Philip Reames 26f6176f38 [AST][Tests] Clarify what each test is doing
llvm-svn: 340100
2018-08-17 21:58:26 +00:00
Philip Reames 9e313167cf [AST[Tests] Shorten tests using noalias params
llvm-svn: 340099
2018-08-17 21:45:57 +00:00
Philip Reames 079c92e201 [AST] Add tests for argmemonly calls [NFC]
First step towards building a test set to rebase D50730 on top of.  Starting with clone of memtransfer tests, more to come.

llvm-svn: 340095
2018-08-17 21:42:18 +00:00
Sanjay Patel 411b86081e [ConstantFolding] add simplifications for funnel shift intrinsics
This is another step towards being able to canonicalize to the funnel shift 
intrinsics in IR (see D49242 for the initial patch). 
We should not have any loss of simplification power in IR between these and 
the equivalent IR constructs.

Differential Revision: https://reviews.llvm.org/D50848

llvm-svn: 340022
2018-08-17 13:23:44 +00:00
Max Kazantsev 7b78d3920c [MustExecute] Fix algorithmic bug in isGuaranteedToExecute. PR38514
The description of `isGuaranteedToExecute` does not correspond to its implementation.
According to description, it should return `true` if an instruction is executed under the
assumption that its loop is *entered*. However there is a sophisticated alrogithm inside
that tries to prove that the instruction is executed if the loop is *exited*, which is not the
same thing for infinite loops. There is an attempt to protect from dealing with infinite loops
by prohibiting loops without exit blocks, however an infinite loop can have exit blocks.

As result of that, MustExecute can falsely consider some blocks that are never entered as
mustexec, and LICM can hoist dangerous instructions out of them basing on this fact.
This may introduce UB to programs which did not contain it initially.

This patch removes the problematic algorithm and replaced it with a one which tries to
prove what is required in description.

Differential Revision: https://reviews.llvm.org/D50558
Reviewed By: reames

llvm-svn: 339984
2018-08-17 06:19:17 +00:00
Sanjay Patel 0ea8d8b951 [ConstantFolding] add tests for funnel shift intrinsics; NFC
No functionality for this yet.

llvm-svn: 339889
2018-08-16 16:10:42 +00:00