Commit Graph

1251 Commits

Author SHA1 Message Date
Simon Pilgrim c22c889f77 Fix line endings
llvm-svn: 291663
2017-01-11 10:25:31 +00:00
Mohammed Agabaria 2c96c43388 [X86] updating TTI costs for arithmetic instructions on X86\SLM arch.
updated instructions:
pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd.

special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq. 
In case if the real operands bitwidth <= 16.

Differential Revision: https://reviews.llvm.org/D28104 

llvm-svn: 291657
2017-01-11 08:23:37 +00:00
Evandro Menezes 330e1b8945 [AArch64] Consider all vector types for FeatureSlowMisaligned128Store
The original code considered only v2i64 as slow for this feature. This patch
consider all 128-bit long vector types as slow candidates.

In internal tests, extending this feature to all 128-bit vector types
resulted in an overall improvement of 1% on Exynos M1.

Differential revision: https://reviews.llvm.org/D27998

llvm-svn: 291616
2017-01-10 23:42:21 +00:00
Simon Pilgrim b6d4fa6551 [CostModel][X86] Add AVX512VL vector shift cost tests.
llvm-svn: 291585
2017-01-10 19:04:12 +00:00
Sanjay Patel baac743254 [ValueTracking] regenerate checks; NFC
llvm-svn: 291468
2017-01-09 19:31:20 +00:00
Chandler Carruth 082c183f06 [PM] Teach SCEV to invalidate itself when its dependencies become
invalid.

This fixes use-after-free bugs that will arise with any interesting use
of SCEV.

I've added a dedicated test that works diligently to trigger these kinds
of bugs in the new pass manager and also checks for them explicitly as
well as triggering ASan failures when things go squirly.

llvm-svn: 291426
2017-01-09 07:44:34 +00:00
Simon Pilgrim 9c58950eeb [CostModel][X86] Fixed vXi8 uniform shift costs.
The 'fast' costs should only work for shifts by uniform constants (uniform non-constant are lowered using the slow default implementation).

Logical shifts were not taking into account that we must mask the psrlw result, so the costs needed to be doubled.

Added missing AVX2/AVX512BW costs as well.

llvm-svn: 291391
2017-01-08 14:14:36 +00:00
Simon Pilgrim 1fa5487c05 [CostModel][X86] Moved legal uniform shift costs earlier.
XOP was prematurely matching, doubling the cost of ashr/lshr uniform shifts.

llvm-svn: 291390
2017-01-08 13:12:03 +00:00
Simon Pilgrim 9681c407b4 [CostModel][X86] Update SSE41/AVX1 vXi32 SHL costs
SSE41 provides pmulld which allows the simpler pslld/paddd/cvttps2dq/pmulld pattern than SSE2's use of pmuludq.

llvm-svn: 291372
2017-01-07 22:27:43 +00:00
Simon Pilgrim a470296367 [CostModel][X86] Fix AVX2 v16i16 shift 'splat' costs.
llvm-svn: 291366
2017-01-07 22:08:09 +00:00
Simon Pilgrim 82e3e05fe2 [CostModel][X86] Match 256-bit vector shift 'splat' costs for AVX2 and above
We were matching against general vector shift costs before the uniform splat costs

llvm-svn: 291365
2017-01-07 21:47:10 +00:00
Simon Pilgrim a4109d6433 [CostModel][AVX512BW] Add v32i16 vector shift costs for avx512bw targets.
llvm-svn: 291354
2017-01-07 17:54:10 +00:00
Simon Pilgrim a1b8e2c725 [X86][AVX512] Use lowerShuffleAsRepeatedMaskAndLanePermute for non-VBMI v64i8 shuffles (PR31470)
llvm-svn: 291347
2017-01-07 15:37:50 +00:00
Simon Pilgrim 9cbcc5ff0b [CostModel][X86] Add AVX512 and 512-bit vector shift cost tests.
llvm-svn: 291269
2017-01-06 19:41:26 +00:00
Chad Rosier e177185e79 [AArch64] Reduce vector insert/extract cost for Falkor.
Differential Revision: https://reviews.llvm.org/D28403

llvm-svn: 291254
2017-01-06 18:03:26 +00:00
Simon Pilgrim d8333372bc [CostModel][X86] Fix 512-bit SDIV/UDIV 'big' costs.
Set the costs on the lowest target that supports the type.

llvm-svn: 291229
2017-01-06 11:12:53 +00:00
Simon Pilgrim 441d1d35d2 [CostModel][X86] Add SDIV/UDIV cost tests for a wider range of targets
Added a test demonstrating bug in AVX512 division costs

llvm-svn: 291228
2017-01-06 11:02:40 +00:00
Simon Pilgrim b01e844241 [CostModel][X86] Include the cost of 256-bit upper subvector extract/insertion in AVX1 v4i64 MUL
Matches other MUL/ADD/SUB 256-bit case on AVX1

llvm-svn: 291149
2017-01-05 18:20:25 +00:00
Chad Rosier e20a3a4831 [AArch64][CostModel] Add coverage for bswap intrinsics.
llvm-svn: 291140
2017-01-05 16:55:32 +00:00
Simon Pilgrim bca02f9e20 [CostModel][X86] Add support for broadcast shuffle costs
Currently only for broadcasts with input and output of the same width.

Differential Revision: https://reviews.llvm.org/D27811

llvm-svn: 291122
2017-01-05 15:56:08 +00:00
Chad Rosier 3ccd1dffff [AArch64] Remove mcpu option as this test is not target specific. NFC.
llvm-svn: 291117
2017-01-05 15:05:03 +00:00
Chad Rosier e1dc73d9a7 [AArch64] Remove unused arguments from tests. NFC.
llvm-svn: 291112
2017-01-05 14:48:53 +00:00
Tobias Grosser 9d88b858c8 Add missing CHECK: line to test case added in 29097
Without this CHECK line, we may not detect incorrectly detected additional
regions at the end of the region tree.

llvm-svn: 290994
2017-01-04 19:35:38 +00:00
Tobias Grosser 8ab80ba3a2 RegionInfo: add new test case
This test case has been reduced from test/Analysis/RegionInfo/mix_1.ll and
provides us with a minimal example of a test case which caused problems while
working on an improved version of the RegionInfo analysis. We upstream this
test case, as it certainly can be helpful in future debugging and optimization
tests.

Test case reduced by Pratik Bhatu <cs12b1010@iith.ac.in>

llvm-svn: 290974
2017-01-04 17:50:15 +00:00
Simon Pilgrim bb895f3e9c [CostModel][X86] Updated vXi8 and vXi16 Reverse/Alternate shuffle costs
Actual codegen is much better than the extract+insert patterns that was assumed.

llvm-svn: 290962
2017-01-04 14:01:33 +00:00
Elena Demikhovsky d96200d60a Fixed shuffle-reverse cost on AVX-512.
(This changed was approved in https://reviews.llvm.org/D28118, but Simon asked to submit it separately).

llvm-svn: 290812
2017-01-02 11:44:10 +00:00
Elena Demikhovsky 21706cbd24 AVX-512 Loop Vectorizer: Cost calculation for interleave load/store patterns.
X86 target does not provide any target specific cost calculation for interleave patterns.It uses the common target-independent calculation, which gives very high numbers. As a result, the scalar version is chosen in many cases. The situation on AVX-512 is even worse, since we have 3-src shuffles that significantly reduce the cost.

In this patch I calculate the cost on AVX-512. It will allow to compare interleave pattern with gather/scatter and choose a better solution (PR31426).

* Shiffle-broadcast cost will be changed in Simon's upcoming patch.

Differential Revision: https://reviews.llvm.org/D28118

llvm-svn: 290810
2017-01-02 10:37:52 +00:00
Sanjay Patel 5865d12e9f [ValueTracking] add tests for known-nonnull-at; NFC
llvm-svn: 290790
2016-12-31 19:23:26 +00:00
Sanjoy Das 00d76a5754 [TBAAVerifier] Be stricter around verifying scalar nodes
This fixes the issue exposed in PR31393, where we weren't trying
sufficiently hard to diagnose bad TBAA metadata.

This does reduce the variety in the error messages we print out, but I
think the tradeoff of verifying more, simply and quickly overrules the
need for more helpful error messags here.

llvm-svn: 290713
2016-12-29 15:47:05 +00:00
Chandler Carruth e14524ca30 [PM] Teach MemDep to invalidate its result object when its cached
analysis handles become invalid.

Add a test case for its invalidation logic.

llvm-svn: 290620
2016-12-27 19:33:04 +00:00
Chandler Carruth 7a73eabf64 [PM] Add more dedicated testing to cover the invalidation logic added to
BasicAA in r290603.

I've kept the basic testing in the new PM test file as that also covers
the AAManager invalidation logic. If/when there is a good place for
broader AA testing it could move there.

This test is somewhat unsatisfying as I can't get it to fail even with
ASan outside of explicit checks of the invalidation. Apparently we don't
yet have any test coverage of the BasicAA code paths using either the
domtree or loopinfo -- I made both of them always be null and check-llvm
passed.

llvm-svn: 290612
2016-12-27 17:59:22 +00:00
Bryant Wong a07d9b1460 [AliasAnalysis] Teach BasicAA about memcpy.
Differential Revision: https://reviews.llvm.org/D27034

llvm-svn: 290526
2016-12-25 22:42:27 +00:00
Simon Pilgrim 081abbb164 [X86][SSE] Improve lowering of vXi64 multiplies
As mentioned on PR30845, we were performing our vXi64 multiplication as:

AloBlo = pmuludq(a, b);
AloBhi = pmuludq(a, psrlqi(b, 32));
AhiBlo = pmuludq(psrlqi(a, 32), b);
return AloBlo + psllqi(AloBhi, 32)+ psllqi(AhiBlo, 32);

when we could avoid one of the upper shifts with:

AloBlo = pmuludq(a, b);
AloBhi = pmuludq(a, psrlqi(b, 32));
AhiBlo = pmuludq(psrlqi(a, 32), b);
return AloBlo + psllqi(AloBhi + AhiBlo, 32);

This matches the lowering on gcc/icc.

Differential Revision: https://reviews.llvm.org/D27756

llvm-svn: 290267
2016-12-21 20:00:10 +00:00
Michael Kuperstein dd92c78669 [ConstantFolding] Fix vector GEPs harder
For vector GEPs, CastGEPIndices can end up in an infinite recursion, because
we compare the vector type to the scalar pointer type, find them different,
and then try to cast a type to itself.

Differential Revision: https://reviews.llvm.org/D28009

llvm-svn: 290260
2016-12-21 17:34:21 +00:00
Daniel Jasper f5123fecfe Add files I seem to have dropped in my revert (r290086).
Sorry!

llvm-svn: 290087
2016-12-19 08:32:13 +00:00
Daniel Jasper aec2fa352f Revert @llvm.assume with operator bundles (r289755-r289757)
This creates non-linear behavior in the inliner (see more details in
r289755's commit thread).

llvm-svn: 290086
2016-12-19 08:22:17 +00:00
Matthew Simpson 2c8de192a1 [AArch64] Guard Misaligned 128-bit store penalty by subtarget feature
This patch checks that the SlowMisaligned128Store subtarget feature is set
when penalizing such stores in getMemoryOpCost.

Differential Revision: https://reviews.llvm.org/D27677

llvm-svn: 289845
2016-12-15 18:36:59 +00:00
Simon Pilgrim 2f7f0e7a48 [CostModel][X86] Updated reverse shuffle costs
llvm-svn: 289819
2016-12-15 14:24:07 +00:00
Simon Pilgrim 9876ed07f6 [CostModel] Fix long standing bug with reverse shuffle mask detection
Incorrect 'undef' mask index matching meant that broadcast shuffles could be detected as reverse shuffles

llvm-svn: 289811
2016-12-15 12:12:45 +00:00
Simon Pilgrim 9ebeac3eed [CostModel][X86] Add tests for reverse shuffle costs
llvm-svn: 289800
2016-12-15 10:45:53 +00:00
Hal Finkel 3ca4a6bcf1 Remove the AssumptionCache
After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...

llvm-svn: 289756
2016-12-15 03:02:15 +00:00
Hal Finkel cb9f78e1c3 Make processing @llvm.assume more efficient by using operand bundles
There was an efficiency problem with how we processed @llvm.assume in
ValueTracking (and other places). The AssumptionCache tracked all of the
assumptions in a given function. In order to find assumptions relevant to
computing known bits, etc. we searched every assumption in the function. For
ValueTracking, that means that we did O(#assumes * #values) work in InstCombine
and other passes (with a constant factor that can be quite large because we'd
repeat this search at every level of recursion of the analysis).

Several of us discussed this situation at the last developers' meeting, and
this implements the discussed solution: Make the values that an assume might
affect operands of the assume itself. To avoid exposing this detail to
frontends and passes that need not worry about it, I've used the new
operand-bundle feature to add these extra call "operands" in a way that does
not affect the intrinsic's signature. I think this solution is relatively
clean. InstCombine adds these extra operands based on what ValueTracking, LVI,
etc. will need and then those passes need only search the users of the values
under consideration. This should fix the computational-complexity problem.

At this point, no passes depend on the AssumptionCache, and so I'll remove
that as a follow-up change.

Differential Revision: https://reviews.llvm.org/D27259

llvm-svn: 289755
2016-12-15 02:53:42 +00:00
Sanjoy Das 3336f681e3 [Verifier] Add verification for TBAA metadata
Summary:
This change adds some verification in the IR verifier around struct path
TBAA metadata.

Other than some basic sanity checks (e.g. we get constant integers where
we expect constant integers), this checks:

 - That by the time an struct access tuple `(base-type, offset)` is
   "reduced" to a scalar base type, the offset is `0`.  For instance, in
   C++ you can't start from, say `("struct-a", 16)`, and end up with
   `("int", 4)` -- by the time the base type is `"int"`, the offset
   better be zero.  In particular, a variant of this invariant is needed
   for `llvm::getMostGenericTBAA` to be correct.

 - That there are no cycles in a struct path.

 - That struct type nodes have their offsets listed in an ascending
   order.

 - That when generating the struct access path, you eventually reach the
   access type listed in the tbaa tag node.

Reviewers: dexonsmith, chandlerc, reames, mehdi_amini, manmanren

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D26438

llvm-svn: 289402
2016-12-11 20:07:15 +00:00
Keno Fischer dc09119776 ConstantFolding: Don't crash when encountering vector GEP
ConstantFolding tried to cast one of the scalar indices to a vector
type. Instead, use the vector type only for the first index (which
is the only one allowed to be a vector) and use its scalar type
otherwise.

Fixes PR31250.

Reviewers: majnemer
Differential Revision: https://reviews.llvm.org/D27389

llvm-svn: 289073
2016-12-08 17:22:35 +00:00
Haicheng Wu f8b834049a [AArch64] Correct the check of signed 9-bit imm in isLegalAddressingMode()
In the addressing mode, signed 9-bit imm is [-256, 255], not [-512, 511].

Differential Revision: https://reviews.llvm.org/D27480

llvm-svn: 288876
2016-12-07 01:45:04 +00:00
Haicheng Wu 584042981d [TTI/CostModel] Correct the way getGEPCost() calls isLegalAddressingMode()
Fix a bug when we call isLegalAddressingMode() from getGEPCost().

Differential Revision: https://reviews.llvm.org/D27357

llvm-svn: 288569
2016-12-03 01:57:24 +00:00
Guozhi Wei 835de1f3ab [ppc] Correctly compute the cost of loading 32/64 bit memory into VSR
VSX has instructions lxsiwax/lxsdx that can load 32/64 bit value into VSX register cheaply. That patch makes it known to memory cost model, so the vectorization of the test case in pr30990 is beneficial.

Differential Revision: https://reviews.llvm.org/D26713

llvm-svn: 288560
2016-12-03 00:41:43 +00:00
Alexey Bataev 62af7252f1 [SLP] Fixed cost model for horizontal reduction.
Currently when cost of scalar operations is evaluated the vector type is
used for scalar operations. Patch fixes this issue and fixes evaluation
of the vector operations cost.
Several test showed that vector cost model is too optimistic. It
allowed vectorization of 8 or less add/fadd operations, though scalar
code is faster. Actually, only for 16 or more operations vector code
provides better performance.

Differential Revision: https://reviews.llvm.org/D26277

llvm-svn: 288398
2016-12-01 18:42:42 +00:00
Alexey Bataev fc617690ab [SLP] Additional tests with the cost of vector operations.
llvm-svn: 288377
2016-12-01 17:26:54 +00:00
Alexey Bataev e59a8351d0 Revert "[SLP] Additional tests with the cost of vector operations."
This reverts commit a61718435fc4118c82f8aa6133fd81f803789c1e.

llvm-svn: 288371
2016-12-01 16:45:04 +00:00
Alexey Bataev 2ff768475d [SLP] Additional tests with the cost of vector operations.
llvm-svn: 288369
2016-12-01 16:11:48 +00:00
Sanjay Patel 8ca30ab0c5 [InstSimplify] allow integer vector types to use computeKnownBits
Note that the non-splat lshr+lshr test folded, but that does not
work in general. Something is missing or wrong in computeKnownBits
as the non-splat shl+shl test still shows.

llvm-svn: 288005
2016-11-27 21:07:28 +00:00
Sanjay Patel dc2917b969 add tests to show missing analysis; NFC
llvm-svn: 287998
2016-11-27 15:54:45 +00:00
Simon Pilgrim 841d7ca463 [X86][AVX512] Add support for v2i64 fptosi/fptoui/sitofp/uitofp on AVX512DQ-only targets
Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances

llvm-svn: 287882
2016-11-24 14:46:55 +00:00
Simon Pilgrim 4e9b9cbee9 [X86][AVX512] Add support for v4i64 fptosi/fptoui/sitofp/uitofp on AVX512DQ-only targets
Use 512-bit instructions with subvector insertion/extraction like we do in a number of similar circumstances

llvm-svn: 287762
2016-11-23 14:01:18 +00:00
Simon Pilgrim 03cd8f887c [CostModel][X86] Add missing AVX512DQ v8i64 fptosi/sitofp costs
llvm-svn: 287760
2016-11-23 13:42:09 +00:00
Simon Pilgrim e5dbdbefca [CostModel][X86] Add v2f32 -> v2i64 fptosi/fptoui cost tests
llvm-svn: 287756
2016-11-23 11:43:00 +00:00
Simon Pilgrim d1aed9a9e6 [CostModel][X86] Updated sitofp/uitofp scalar/vector cost tests
Better coverage of all legal types + special cases.

Removed old fptoui tests which are all handled in fptoui.ll

llvm-svn: 287678
2016-11-22 18:55:49 +00:00
Yaxun Liu 02f75f31e0 Fix known zero bits for addrspacecast.
Currently LLVM assumes that a pointer addrspacecasted to a different addr space is equivalent to trunc or zext bitwise, which is not true. For example, in amdgcn target, when a null pointer is addrspacecasted from addr space 4 to 0, its value is changed from i64 0 to i32 -1.

This patch teaches LLVM not to assume known bits of addrspacecast instruction to its operand.

Differential Revision: https://reviews.llvm.org/D26803

llvm-svn: 287545
2016-11-21 15:42:31 +00:00
Craig Topper 07f1c15995 [AVX-512] Support FCOPYSIGN for v16f32 and v8f64
Summary:
This extends FCOPYSIGN support to 512-bit vectors.

I've also added tests to show what the 128-bit and 256-bit cases look like with broadcast loads.

Reviewers: delena, zvi, RKSimon, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26791

llvm-svn: 287298
2016-11-18 02:25:34 +00:00
Simon Pilgrim 779da8e5ea [CostModel][X86] Added mul costs for vXi8 vectors
More realistic v16i8/v32i8/v64i8 MUL costs - we have to extend to vXi16, use PMULLW and then truncate the result

llvm-svn: 286838
2016-11-14 15:54:24 +00:00
Simon Pilgrim 27fed8e5d6 [X86][AVX] Fixed v16i16/v32i8 ADD/SUB costs on AVX1 subtargets
Add explicit v16i16/v32i8 ADD/SUB costs, matching the costs of v4i64/v8i32 - they were missing for some reason.

This has side effects on the LV max bandwidth tests (AVX1 now prefers 128-bit vectors vs AVX2 which still prefers 256-bit)

llvm-svn: 286832
2016-11-14 14:45:16 +00:00
Peter Collingbourne d93620bf4d IR: Introduce inrange attribute on getelementptr indices.
If the inrange keyword is present before any index, loading from or
storing to any pointer derived from the getelementptr has undefined
behavior if the load or store would access memory outside of the bounds of
the element selected by the index marked as inrange.

This can be used, e.g. for alias analysis or to split globals at element
boundaries where beneficial.

As previously proposed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-July/102472.html

Differential Revision: https://reviews.llvm.org/D22793

llvm-svn: 286514
2016-11-10 22:34:55 +00:00
Tobias Grosser 455b9bd65c [RegionInfo] Add three tests that include infinite loops
These examples are variations that were inspired from a small subgraph taken
from paper.ll which are interesting as they show certain issues with infinite
loops.

llvm-svn: 286450
2016-11-10 13:56:19 +00:00
Andrew Kaylor 9604f34996 [BasicAA] Teach BasicAA to handle the inaccessiblememonly and inaccessiblemem_or_argmemonly attributes
Differential Revision: https://reviews.llvm.org/D26382

llvm-svn: 286294
2016-11-08 21:07:42 +00:00
Simon Pilgrim d02c55204b [VectorLegalizer] Expansion of CTLZ using CTPOP when possible
This patch avoids scalarization of CTLZ by instead expanding to use CTPOP (ref: "Hacker's Delight") when the necessary operations are available.

This also adds the necessary cost models for X86 SSE2 targets (the main beneficiary) to ensure vectorization only happens when its useful.

Differential Revision: https://reviews.llvm.org/D25910

llvm-svn: 286233
2016-11-08 14:10:28 +00:00
Chad Rosier 8f348017b0 [AliasSetTracker] Make AST smarter about assume intrinsics that don't actually affect memory.
Differential Revision: https://reviews.llvm.org/D26252

llvm-svn: 286108
2016-11-07 14:11:45 +00:00
Alexey Bataev d07c731d86 Improved cost model for FDIV and FSQRT, by Andrew Tischenko
There is a bug describing poor cost model for floating point operations:
Bug 29083 - [X86][SSE] Improve costs for floating point operations. This
patch is the second one in series of patches dealing with cost model.

Differential Revision: https://reviews.llvm.org/D25722

llvm-svn: 285564
2016-10-31 12:10:53 +00:00
Tom Stellard 13068995b9 [Loads] Fix crash in is isDereferenceableAndAlignedPointer()
Summary:
We were trying to add APInt values with different bit sizes after
visiting an addrspacecast instruction which changed the bit width
of the pointer.

Reviewers: majnemer, hfinkel

Subscribers: hfinkel, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D24774

llvm-svn: 285407
2016-10-28 15:32:28 +00:00
Simon Pilgrim d23219b9ee [X86][AVX512] Fix MUL v8i64 costs on non-AVX512DQ targets
llvm-svn: 285329
2016-10-27 18:32:06 +00:00
Simon Pilgrim 820e1326d7 [X86][AVX512DQ] Improve lowering of MUL v2i64 and v4i64
With DQI but without VLX, lower v2i64 and v4i64 MUL operations with v8i64 MUL (vpmullq).

Updated cost table accordingly.

Differential Revision: https://reviews.llvm.org/D26011

llvm-svn: 285304
2016-10-27 15:27:00 +00:00
Chad Rosier 4447d7a816 Revert "[AliasSetTracker] Make AST smarter about intrinsics that don't actually affect memory."
This reverts commit r285191.

LICM appears to rely on the Alias Set Tracker hitting lifetime markers to prevent
code from being moved outside of the original scope.

llvm-svn: 285227
2016-10-26 19:18:19 +00:00
Chad Rosier 1408628ffa [AliasSetTracker] Make AST smarter about intrinsics that don't actually affect memory.
Differential Revision: https://reviews.llvm.org/D25969

llvm-svn: 285191
2016-10-26 12:42:11 +00:00
Eli Friedman c5b7262073 Fix regression from my recent GlobalsAA fix.
There are two fixes here: one, AnalyzeUsesOfPointer can't return
false until it has checked all the uses of the pointer. Two, if a
global uses another global, we have to assume the address of the
first global escapes.

Fixes https://llvm.org/bugs/show_bug.cgi?id=30707 .

Differential Revision: https://reviews.llvm.org/D25798

llvm-svn: 285034
2016-10-24 21:47:44 +00:00
Simon Pilgrim d09c04d267 [CostModel][X86] Added tests for current integer signed/unsigned remainder costs
llvm-svn: 284940
2016-10-23 18:35:02 +00:00
Simon Pilgrim 6ac1e98b09 [X86][SSE] Add SSE41/AVX1 costs for vector shifts.
We were defaulting to SSE2 costs which weren't taking into account the availability of PBLENDW/PBLENDVB to improve merging of per-element shift results.

llvm-svn: 284939
2016-10-23 16:49:04 +00:00
Simon Pilgrim e16b1e2271 [CostModel][X86] Added tests for current integer trunc costs
llvm-svn: 284938
2016-10-23 15:17:52 +00:00
Gerolf Hoflehner 9e2afa8bd7 [BasicAA] Fix - missed alias in GEP expressions
In BasicAA GEP operand values get adjusted ("wrap-around") based on the
pointersize. Otherwise, in non-64b modes, AA could report false negatives.
However, a wrap-around is valid only for a fully evaluated expression.
It had been introduced to fix an alias problem in
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160118/326163.html.
This commit restricts the wrap-around to constant gep operands only where the
value is known at compile-time.

llvm-svn: 284908
2016-10-22 02:41:39 +00:00
Li Huang faa857dba7 [SCEV] Memoize visitMulExpr results in SCEVRewriteVisitor.
Summary:
When SCEVRewriteVisitor traverses the SCEV DAG, it may visit the same SCEV
multiple times if this SCEV is referenced by multiple other SCEVs. This has
exponential time complexity in the worst case. Memoizing the results will
avoid re-visiting the same SCEV. Add a map to save the results, and override
the visit function of SCEVVisitor. Now SCEVRewriteVisitor only visit each
SCEV once and thus returns the same result for the same input SCEV.

This patch fixes PR18606, PR18607.

Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin

Differential Revision: https://reviews.llvm.org/D25810

llvm-svn: 284868
2016-10-21 20:05:21 +00:00
John Brawn 84b21835f1 [LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops
When we have a loop with a known upper bound on the number of iterations, and
furthermore know that either the number of iterations will be either exactly
that upper bound or zero, then we can fully unroll up to that upper bound
keeping only the first loop test to check for the zero iteration case.

Most of the work here is in plumbing this 'max-or-zero' information from the
part of scalar evolution where it's detected through to loop unrolling. I've
also gone for the safe default of 'false' everywhere but howManyLessThans which
could probably be improved.

Differential Revision: https://reviews.llvm.org/D25682

llvm-svn: 284818
2016-10-21 11:08:48 +00:00
Li Huang fcfe8cd3ae [SCEV] Add a threshold to restrict number of mul operands to be inlined into SCEV
This is to avoid inlining too many multiplication operands into a SCEV, which could 
take exponential time in the worst case.

Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin

Differential Revision: https://reviews.llvm.org/D25794

llvm-svn: 284784
2016-10-20 21:38:39 +00:00
Simon Pilgrim 365be4f95c [CostModel][X86] Fixed AVX1/AVX512 sdiv/udiv uniformconst costs for 256/512 bit integer vectors
We weren't checking for uniform const costs before the general cost, resulting in very high estimates.

llvm-svn: 284755
2016-10-20 18:00:35 +00:00
Simon Pilgrim 1388c0acc1 [CostModel][X86] Added tests for sdiv/udiv costs for uniform const and uniform const power-of-2
Shows poor costings in AVX1/AVX512BW for certain vector types

llvm-svn: 284748
2016-10-20 17:16:38 +00:00
Simon Pilgrim 025e26dd32 [CostModel][X86] Fixed AVX1/AVX512 sdiv/udiv general costs for 256/512 bit integer vectors
We weren't accounting for legal types on every subtarget, meaning that many of the costs were using defaults.

We still don't correctly cost (or test) the 512-bit sdiv/udiv by uniform const cases, nor the power-of-2 cases.

llvm-svn: 284744
2016-10-20 16:39:11 +00:00
Simon Pilgrim 16cc616ebc [CostModel][X86] Added tests for sdiv/udiv costs for scalar and 128/256/512 bit integer vectors
Shows current bug in AVX1/AVX512BW costs for 256 bit vector types

llvm-svn: 284723
2016-10-20 12:34:00 +00:00
Chad Rosier 6e3a92ec88 [AliasSetTracker] Add support for memcpy and memmove.
Differential Revision: https://reviews.llvm.org/D25776

llvm-svn: 284630
2016-10-19 19:09:03 +00:00
John Brawn ecf79300dd [SCEV] More accurate calculation of max backedge count of some less-than loops
In loops that look something like
 i = n;
 do {
  ...
 } while(i++ < n+k);
where k is a constant, the maximum backedge count is k (in fact the backedge
count will be either 0 or k, depending on whether n+k wraps). More generally
for LHS < RHS if RHS-(LHS of first comparison) is a constant then the loop will
iterate either 0 or that constant number of times.

This allows for more loop unrolling with the recent upper bound loop unrolling
changes, and I'm working on a patch that will let loop unrolling additionally
make use of the loop being executed either 0 or k times (we need to retain the
loop comparison only on the first unrolled iteration).

Differential Revision: https://reviews.llvm.org/D25607

llvm-svn: 284465
2016-10-18 10:10:53 +00:00
Simon Pilgrim 4ddc92b6cd [X86][SSE] Add lowering to cvttpd2dq/cvttps2dq for sitofp v2f64/2f32 to 2i32
As discussed on PR28461 we currently miss the chance to lower "fptosi <2 x double> %arg to <2 x i32>" to cvttpd2dq due to its use of illegal types.

This patch adds support for fptosi to 2i32 from both 2f64 and 2f32.

It also recognises that cvttpd2dq zeroes the upper 64-bits of the xmm result (similar to D23797) - we still don't do this for the cvttpd2dq/cvttps2dq intrinsics - this can be done in a future patch.

Differential Revision: https://reviews.llvm.org/D23808

llvm-svn: 284459
2016-10-18 07:42:15 +00:00
Tobias Grosser 2bbec0ee7f [SCEV] Consider delinearization pattern with extension with identity factor
Summary: The delinearization algorithm did not consider terms which had an extension without a multiply factor, i.e. a identify factor. We lose cases where size is char type where there will no multiply factor.

Reviewers: sanjoy, grosser

Subscribers: mzolotukhin, Eugene.Zelenko, llvm-commits, mssimpso, sanjoy, grosser

Differential Revision: https://reviews.llvm.org/D16492

llvm-svn: 284378
2016-10-17 11:56:26 +00:00
Tom Stellard 17eb3413cd [ValueTracking] Fix crash in GetPointerBaseWithConstantOffset()
Summary:
While walking defs of pointer operands we were assuming that the pointer
size would remain constant.  This is not true, because addresspacecast
instructions may cast the pointer to an address space with a different
pointer width.

This partial reverts r282612, which was a more conservative solution
to this problem.

Reviewers: reames, sanjoy, apilipenko

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D24772

llvm-svn: 283557
2016-10-07 14:23:29 +00:00
Bjorn Pettersson 3961603921 [ValueTracking] Teach computeKnownBits and ComputeNumSignBits to look through ExtractElement.
Summary:
The computeKnownBits and ComputeNumSignBits functions in ValueTracking can now do a simple look-through of ExtractElement.

Reviewers: majnemer, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24955

llvm-svn: 283434
2016-10-06 09:56:21 +00:00
Eli Friedman 74bed9d757 Make GlobalsAA ignore dead constant expressions.
Slightly improves the precision of GlobalsAA in certain situations, and
makes the behavior of optimization passes more predictable.

Differential Revision: https://reviews.llvm.org/D24104

llvm-svn: 283165
2016-10-04 00:03:55 +00:00
Sanjay Patel d27a21874b [x86, SSE/AVX] allow 128/256-bit lowering for copysign vector intrinsics (PR30433)
This should fix:
https://llvm.org/bugs/show_bug.cgi?id=30433

There are a couple of open questions about the codegen:
1. Should we let scalar ops be scalars and avoid vector constant loads/splats?
2. Should we have a pass to combine constants such as the inverted pair that we have here?

Differential Revision: https://reviews.llvm.org/D25165
 

llvm-svn: 283119
2016-10-03 16:38:27 +00:00
Simon Pilgrim 1ec20e9b0a [CostModel][X86] Added tests for current fptosi/fptoui costs
llvm-svn: 283047
2016-10-01 19:09:59 +00:00
Simon Pilgrim e0ec5c1f05 [CostModel][X86] Added fcopysign costs
llvm-svn: 283044
2016-10-01 16:41:52 +00:00
Simon Pilgrim 8b021c382d [CostModel][X86] Added fabs costs
llvm-svn: 283042
2016-10-01 16:30:13 +00:00
Jun Bum Lim 3822939ba7 Enhance calcColdCallHeuristics for InvokeInst
Summary: When identifying cold blocks, consider only the edge to the normal destination if the terminator is InvokeInst and let calcInvokeHeuristics() decide edge weights for the InvokeInst.

Reviewers: mcrosier, hfinkel, davidxl

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D24868

llvm-svn: 282262
2016-09-23 17:26:14 +00:00
Simon Pilgrim 9178059826 [CostModel][X86] Added scalar float op costs
llvm-svn: 281864
2016-09-18 21:01:20 +00:00
Sanjay Patel e104a5f35a auto-generate checks
llvm-svn: 281756
2016-09-16 17:54:52 +00:00
David L Kreitzer 8bbabee21a Reapplying r278731 after fixing the problem that caused it to be reverted.
Enhance SCEV to compute the trip count for some loops with unknown stride.

Patch by Pankaj Chawla

Differential Revision: https://reviews.llvm.org/D22377

llvm-svn: 281732
2016-09-16 14:38:13 +00:00
Wei Mi 24662395df Create a getelementptr instead of sub expr for ValueOffsetPair if the
value is a pointer.

This patch is to fix PR30213. When expanding an expr based on ValueOffsetPair,
if the value is of pointer type, we can only create a getelementptr instead
of sub expr.

Differential Revision: https://reviews.llvm.org/D24088

llvm-svn: 281439
2016-09-14 04:39:50 +00:00
Elena Demikhovsky 3622fbfc68 [Loop Vectorizer] Fixed memory confilict checks.
Fixed a bug in run-time checks for possible memory conflicts inside loop.
The bug is in Low <-> High boundaries calculation. The High boundary should be calculated as "last memory access pointer + element size".

Differential revision: https://reviews.llvm.org/D23176

llvm-svn: 279930
2016-08-28 08:53:53 +00:00
Wei Mi 59ca96636d [UNROLL] Postpone ScalarEvolution::forgetLoop after TripCountSC is expanded
when unroll runtime iteration loop.

In llvm::UnrollRuntimeLoopRemainder, if the loop to be unrolled is the inner
loop inside a loop nest, the scalar evolution needs to be dropped for its
parent loop which is done by ScalarEvolution::forgetLoop. However, we can
postpone forgetLoop to the end of UnrollRuntimeLoopRemainder so TripCountSC
expansion can still reuse existing value.

Differential Revision: https://reviews.llvm.org/D23572

llvm-svn: 279748
2016-08-25 16:17:18 +00:00
Evgeny Stupachenko d7f9c3564a The patch improves ValueTracking on left shift with nsw flag.
Summary:
The patch fixes PR28946.

Reviewers: majnemer, sanjoy

Differential Revision: http://reviews.llvm.org/D23296

From: Li Huang
llvm-svn: 279684
2016-08-24 23:01:33 +00:00
Artur Pilipenko a1d9a67496 Remove missing file from r279433 reversal
llvm-svn: 279434
2016-08-22 13:18:19 +00:00
Simon Pilgrim 89e375a95e [CostModel][X86] Removed shift tests
There are more thorough tests found in vshift-*-cost.ll 

llvm-svn: 279406
2016-08-21 19:56:02 +00:00
Simon Pilgrim 6ad12ec629 [CostModel][X86] Added costs for vXi16 and vXi8 vectors for add/sub/mul/and/or/xor tests
llvm-svn: 279405
2016-08-21 19:44:44 +00:00
Simon Pilgrim b0a0576ffc [CostModel][X86] Replaced SSSE3 with SSE2 costs to create a better baseline
llvm-svn: 279404
2016-08-21 19:14:48 +00:00
Simon Pilgrim 07d7a21ea1 [CostModel][X86] Added fsqrt and fma costs
llvm-svn: 279403
2016-08-21 19:06:25 +00:00
Simon Pilgrim 3cd61a084f [CostModel][X86] Split off float arithmetic cost tests
llvm-svn: 279402
2016-08-21 18:34:47 +00:00
Simon Pilgrim 054e7d2ec1 [CostModel][X86] Added sub, or, and, fadd and fsub costs and missing 512-bit mul costs
llvm-svn: 279301
2016-08-19 19:07:10 +00:00
Simon Pilgrim fbfa3ee4f6 [CostModel][X86] Added some AVX512 and 512-bit vector cost tests
llvm-svn: 279291
2016-08-19 18:24:10 +00:00
Simon Pilgrim e309d2d0c3 [CostModel][X86] Add fdiv + frem cost tests
llvm-svn: 279283
2016-08-19 17:39:00 +00:00
Michael Kuperstein 41898f0396 [AliasSetTracker] Degrade AliasSetTracker when may-alias sets get too large.
Repeated inserts into AliasSetTracker have quadratic behavior - inserting a
pointer into AST is linear, since it requires walking over all "may" alias
sets and running an alias check vs. every pointer in the set.

We can avoid this by tracking the total number of pointers in "may" sets,
and when that number exceeds a threshold, declare the tracker "saturated".
This lumps all pointers into a single "may" set that aliases every other
pointer.

(This is a stop-gap solution until we migrate to MemorySSA)

This fixes PR28832.
Differential Revision: https://reviews.llvm.org/D23432

llvm-svn: 279274
2016-08-19 17:05:22 +00:00
Hans Wennborg 3879035e66 SCEV: Don't assert about non-SCEV-able value in isSCEVExprNeverPoison() (PR28932)
Differential Revision: https://reviews.llvm.org/D23594

llvm-svn: 278999
2016-08-17 22:50:18 +00:00
Reid Kleckner b99b709068 Revert "Enhance SCEV to compute the trip count for some loops with unknown stride."
This reverts commit r278731. It caused http://crbug.com/638314

llvm-svn: 278853
2016-08-16 21:02:04 +00:00
Sanjoy Das 78db2963f6 Revert "[ValueTracking] Improve ValueTracking on left shift with nsw flag"
This reverts commit r278172.  It causes PR28946.

llvm-svn: 278740
2016-08-15 21:01:31 +00:00
David L Kreitzer 7fe18251a5 Enhance SCEV to compute the trip count for some loops with unknown stride.
Patch by Pankaj Chawla

Differential Revision: https://reviews.llvm.org/D22377

llvm-svn: 278731
2016-08-15 20:21:41 +00:00
Eli Friedman a6707f56b5 [DSE] Don't remove stores made live by a call which unwinds.
Issue exposed by noalias or more aggressive alias analysis.

Fixes http://llvm.org/PR25422.

Differential revision: https://reviews.llvm.org/D21007

llvm-svn: 278451
2016-08-12 01:09:53 +00:00
Andrew Kaylor b10f6876cd [ValueTracking] An improvement to IR ValueTracking on Non-negative Integers
Patch by Li Huang

Differential Revision: https://reviews.llvm.org/D18777

llvm-svn: 278267
2016-08-10 18:47:19 +00:00
Andrew Kaylor 3c05edfd5e [ValueTracking] Improve ValueTracking on left shift with nsw flag
Patch by Li Huang

Differential Revison: https://reviews.llvm.org/D23296

llvm-svn: 278172
2016-08-09 22:41:35 +00:00
Wei Mi 575435012c Fix the runtime error caused by "Use ValueOffsetPair to enhance value reuse during SCEV expansion".
The patch is to fix the bug in PR28705. It was caused by setting wrong return
value for SCEVExpander::findExistingExpansion. The return values of findExistingExpansion
have different meanings when the function is used in different ways so it is easy to make
mistake. The fix creates two new interfaces to replace SCEVExpander::findExistingExpansion,
and specifies where each interface is expected to be used.

Differential Revision: https://reviews.llvm.org/D22942

llvm-svn: 278161
2016-08-09 20:40:03 +00:00
Wei Mi 785858cf6c Recommit "Use ValueOffsetPair to enhance value reuse during SCEV expansion".
The fix for PR28705 will be committed consecutively.

In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion.
However, const folding and sext/zext distribution can make the reuse still difficult.

A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and
  S1 = S2 + C_a
  S3 = S2 + C_b
where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as
V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a
complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused
by the fact that S3 is generated from S1 after const folding.

In order to do that, we represent ExprValueMap as a mapping from SCEV to
ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the
ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first
expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to
V1 - C_a + C_b.

Differential Revision: https://reviews.llvm.org/D21313

llvm-svn: 278160
2016-08-09 20:37:50 +00:00
Sanjoy Das d4c85af7fd [SCEV] Un-grep'ify tests; NFC
llvm-svn: 277861
2016-08-05 20:33:49 +00:00
Sanjoy Das b0b4e86215 [SCEV] Don't infinitely recurse on unreachable code
llvm-svn: 277848
2016-08-05 18:34:14 +00:00
Michael Kuperstein 3ceac2bbd5 [LV, X86] Be more optimistic about vectorizing shifts.
Shifts with a uniform but non-constant count were considered very expensive to
vectorize, because the splat of the uniform count and the shift would tend to
appear in different blocks. That made the splat invisible to ISel, and we'd
scalarize the shift at codegen time.

Since r201655, CodeGenPrepare sinks those splats to be next to their use, and we
are able to select the appropriate vector shifts. This updates the cost model to
to take this into account by making shifts by a uniform cheap again.

Differential Revision: https://reviews.llvm.org/D23049

llvm-svn: 277782
2016-08-04 22:48:03 +00:00
Simon Pilgrim c8fe132756 [X86] Dropped XOP ctbits checks - they match the AVX checks
llvm-svn: 277718
2016-08-04 11:04:13 +00:00
Simon Pilgrim 5d5ca9c0cb [X86][SSE] Add initial costs for vector CTTZ/CTLZ
llvm-svn: 277716
2016-08-04 10:51:41 +00:00
George Burgess IV 5f0e76dca6 [CFLAA] Remove modref queries from CFLAA.
As it turns out, modref queries are broken with CFLAA. Specifically,
the data source we were using for determining modref behaviors
explicitly ignores operations on non-pointer values. So, it wouldn't
note e.g. storing an i32 to an i32* (or loading an i64 from an i64*).
It also ignores external function calls, rather than acting
conservatively for them.

(N.B. These operations, where necessary, *are* tracked by CFLAA; we just
use a different mechanism to do so. Said mechanism is relatively
imprecise, so it's unlikely that we can provide reasonably good modref
answers with it as implemented.)

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22978

llvm-svn: 277366
2016-08-01 18:47:28 +00:00
Adam Nemet aa3506c5f0 [BPI] Add new LazyBPI analysis
Summary:
The motivation is the same as in D22141: In order to add the hotness
attribute to optimization remarks we need BFI to be available in all
passes that emit optimization remarks.  BFI depends on BPI so unless we
make this lazy as well we would still compute BPI unconditionally.

The solution is to use the new LazyBPI pass in LazyBFI and only compute
BPI when computation of BFI is requested by the client.

I extended the laziness test using a LoopDistribute test to also cover
BPI.

Reviewers: hfinkel, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22835

llvm-svn: 277083
2016-07-28 23:31:12 +00:00
George Burgess IV dbd35c44d4 [CFLAA] Add getModRefBehavior to CFLAnders.
This patch lets CFLAnders respond to mod-ref queries. It also includes
a small bugfix to CFLSteens.

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22823

llvm-svn: 276939
2016-07-27 23:07:07 +00:00
Hans Wennborg 685e8ff953 Revert r276136 "Use ValueOffsetPair to enhance value reuse during SCEV expansion."
It causes Clang tests to fail after Windows self-host (PR28705).

(Also reverts follow-up r276139.)

llvm-svn: 276822
2016-07-26 23:25:13 +00:00
Wei Mi 97de034e18 Remove useless pass from the pipeline in test/Analysis/Dominators/2007-01-14-BreakCritEdges.ll.
llvm-svn: 276644
2016-07-25 16:27:34 +00:00
Sanjoy Das a7d9ec8751 [SCEV] Make isImpliedCondOperandsViaRanges smarter
This change lets us prove things like

  "{X,+,10} s< 5000" implies "{X+7,+,10} does not sign overflow"

It does this by replacing replacing getConstantDifference by
computeConstantDifference (which is smarter) in
isImpliedCondOperandsViaRanges.

llvm-svn: 276505
2016-07-23 00:54:36 +00:00
Wei Mi e04d0eff29 [PM] Port BreakCriticalEdges to the new PM.
Differential Revision: https://reviews.llvm.org/D22688

llvm-svn: 276449
2016-07-22 18:04:25 +00:00
Wei Mi 481232e991 Fix test/Analysis/ScalarEvolution/scev-expander-existing-value-offset.ll for rL276136.
The content in this testcase was accidentally duplicated. Fix the error.

llvm-svn: 276139
2016-07-20 16:54:58 +00:00
Wei Mi db80c0c77f Use ValueOffsetPair to enhance value reuse during SCEV expansion.
In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion.
However, const folding and sext/zext distribution can make the reuse still difficult.

A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and
  S1 = S2 + C_a
  S3 = S2 + C_b
where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as
V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a
complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused
by the fact that S3 is generated from S1 after const folding.

In order to do that, we represent ExprValueMap as a mapping from SCEV to
ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the
ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first
expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to
V1 - C_a + C_b.

Differential Revision: https://reviews.llvm.org/D21313

llvm-svn: 276136
2016-07-20 16:40:33 +00:00
Simon Pilgrim 1b4f511aaa [X86][SSE] Add cost model values for CTPOP of vectors
This patch adds costs for the vectorized implementations of CTPOP, the default values were seriously underestimating the cost of these and was encouraging vectorization on targets where serialized use of POPCNT would be much better.

Differential Revision: https://reviews.llvm.org/D22456

llvm-svn: 276104
2016-07-20 10:41:28 +00:00
George Burgess IV 8b85321bae [CFLAA] Make a test tell the truth. NFC.
Dishonesty noted by Jia Chen.

llvm-svn: 276028
2016-07-19 20:56:41 +00:00
George Burgess IV 3b059841ff [CFLAA] Add some interproc. analysis to CFLAnders.
This patch adds function summary support to CFLAnders. It also comes
with a lot of tests! Woohoo!

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22450

llvm-svn: 276026
2016-07-19 20:47:15 +00:00
Simon Pilgrim 47638635cc [X86] Add CTPOP/CTLZ/CTTZ scalar cost tests
llvm-svn: 275725
2016-07-17 18:29:19 +00:00
George Burgess IV 22682e293b [CFLAA] Add attributes handling for CFLAnders.
This patch adds proper handling of stratified attributes into our
anders-style CFLAA implementation. It also comes bundled with more
CFLAnders tests. :)

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22325

llvm-svn: 275604
2016-07-15 20:02:49 +00:00
George Burgess IV 6d30aa03a0 [CFLAA] Add an initial CFLAnders implementation.
This adds an incomplete anders-style implementation for CFLAA. It's
incomplete in that it's missing interprocedural analysis, attrs
handling, etc. and that it needs more tests. More tests and features
will be added in future commits.

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22291

llvm-svn: 275602
2016-07-15 19:53:25 +00:00
Tom Stellard 1b5cf6217e GlobalsAA: Functions with the argmemonly attribute won't read arbitrary globals
Summary:
In preparation for changing GlobalsAA to stop assuming that intrinsics
can't read arbitrary globals, we need to make sure GlobalsAA is querying
function attributes rather than relying on this assumption.

This patch was inspired by: http://reviews.llvm.org/D20206

Reviewers: jmolloy, hfinkel

Subscribers: eli.friedman, llvm-commits

Differential Revision: https://reviews.llvm.org/D21318

llvm-svn: 275433
2016-07-14 15:50:27 +00:00
Adam Nemet c2f791d8a7 [BFI] Add new LazyBFI analysis pass
Summary:
This is necessary for D21771.  In order to add the hotness attribute to
optimization remarks we need BFI to be available in all passes that emit
optimization remarks.

However we don't want to pay for computing BFI unless the hotness
attribute is requested.

This is achieved by making BFI lazy at the very high-level through a new
analysis pass -- BFI is not calculated unless requested.

I am adding a test to check the laziness under D21771 where the first
user of the analysis is added.

Reviewers: hfinkel, dexonsmith, davidxl

Subscribers: davidxl, dexonsmith, llvm-commits

Differential Revision: http://reviews.llvm.org/D22141

llvm-svn: 275250
2016-07-13 05:01:48 +00:00
Keno Fischer 1efc3b70c5 Fix ScalarEvolutionExpander step scaling bug
The expandAddRecExprLiterally function incorrectly transforms
`[Start + Step * X]` into `Step * [Start + X]` instead of the correct
transform of `[Step * X] + Start`.

This caused https://github.com/JuliaLang/julia/issues/14704#issuecomment-174126219
due to what appeared to be sufficiently complicated loop interactions.

Patch by Jameson Nash (jameson@juliacomputing.com).

Reviewers: sanjoy
Differential Revision: http://reviews.llvm.org/D16505

llvm-svn: 275239
2016-07-13 01:28:12 +00:00
Michael Kuperstein f0c59330e9 [X86] Make some cast costs more precise
Make some AVX and AVX512 cast costs more precise.
Based on part of a patch by Elena Demikhovsky (D15604).

Differential Revision: http://reviews.llvm.org/D22064

llvm-svn: 275106
2016-07-11 21:39:44 +00:00
Hal Finkel bf3957a553 Teach isDereferenceablePointer to look through returned-argument functions
For functions which are known to return their argument,
isDereferenceableAndAlignedPointer can examine the argument value.

Differential Revision: http://reviews.llvm.org/D9384

llvm-svn: 275038
2016-07-11 03:08:49 +00:00
Hal Finkel e186debb8b Teach SCEV to look through returned-argument functions
When building SCEVs, if a function is known to return its argument, then we can
build the SCEV using the corresponding argument value.

Differential Revision: http://reviews.llvm.org/D9381

llvm-svn: 275037
2016-07-11 02:48:23 +00:00
Hal Finkel 5c12d8fe8f BasicAA should look through functions with returned arguments
Motivated by the work on the llvm.noalias intrinsic, teach BasicAA to look
through returned-argument functions when answering queries. This is essential
so that we don't loose all other AA information when supplementing with
llvm.noalias.

Differential Revision: http://reviews.llvm.org/D9383

llvm-svn: 275035
2016-07-11 01:32:20 +00:00