Commit Graph

2428 Commits

Author SHA1 Message Date
Sanjay Patel 12a2af447b [InstCombine] add test to show missing vector optimization; NFC
llvm-svn: 287982
2016-11-26 16:13:23 +00:00
Sanjay Patel 8bd69b7ed9 [InstCombine] don't drop metadata in FoldOpIntoSelect()
llvm-svn: 287980
2016-11-26 15:23:20 +00:00
Sanjay Patel e359eaaf70 [InstCombine] change bitwise logic type to eliminate bitcasts
In PR27925:
https://llvm.org/bugs/show_bug.cgi?id=27925

...we proposed adding this fold to eliminate a bitcast. In D20774, there was 
some concern about changing the type of a bitwise op as well as creating 
bitcasts that might not be free for a target. However, if we're strictly 
eliminating an instruction (by limiting this to one-use ops), then we should 
be able to do this in InstCombine.

But we're cautiously restricting the transform for now to vector types to
avoid possible backend problems. A transform to make sure the logic op is
legal for the target should be added to reverse this transform and improve
codegen.

Differential Revision: https://reviews.llvm.org/D26641

llvm-svn: 287707
2016-11-22 22:05:48 +00:00
Sanjay Patel 3b0bafee63 [InstCombine] canonicalize min/max constant to select's false value
This is a first step towards canonicalization and improved folding/codegen
for integer min/max as discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/106868.html

Here, we're just matching the simplest min/max patterns and adjusting the
icmp predicate while swapping the select operands.

I've included FIXME tests in test/Transforms/InstCombine/select_meta.ll
so it's easier to see how this might be extended (corresponds to the TODO
comment in the code). That's also why I'm using matchSelectPattern()
rather than a simpler check; once the backend is patched, we can just 
remove some of the restrictions to allow the obfuscated min/max patterns
in the FIXME tests to be matched.

Differential Revision: https://reviews.llvm.org/D26525

llvm-svn: 287585
2016-11-21 22:04:14 +00:00
Yaxun Liu 02f75f31e0 Fix known zero bits for addrspacecast.
Currently LLVM assumes that a pointer addrspacecasted to a different addr space is equivalent to trunc or zext bitwise, which is not true. For example, in amdgcn target, when a null pointer is addrspacecasted from addr space 4 to 0, its value is changed from i64 0 to i32 -1.

This patch teaches LLVM not to assume known bits of addrspacecast instruction to its operand.

Differential Revision: https://reviews.llvm.org/D26803

llvm-svn: 287545
2016-11-21 15:42:31 +00:00
Sanjay Patel 47e577eb92 [InstCombine] add tests to show likely unwanted select widening; NFC
This is a prerequisite patch for D26556:
https://reviews.llvm.org/D26556

...because there was no direct coverage for these folds (which in some cases are adding instructions).

llvm-svn: 287400
2016-11-18 23:22:00 +00:00
Craig Topper 1de753f7f5 [InstCombine][AVX-512] Teach InstCombineCalls how to handle the intrinsics for variable shift with 16-bit elements.
This is a straightforward extension of the existing support for 32/64-bit element types. Just needed to add the additional instrinsics to the switches.

llvm-svn: 287316
2016-11-18 06:04:33 +00:00
Craig Topper 6910fa0ef4 [X86] Remove the scalar intrinsics for fadd/fsub/fdiv/fmul
Summary: These intrinsics have been unused for clang for a while. This patch removes them. We auto upgrade them to extractelements, a scalar operation and then an insertelement. This matches the sequence used by clangs intrinsic file.

Reviewers: zvi, delena, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26660

llvm-svn: 287083
2016-11-16 05:24:10 +00:00
Sanjay Patel bb238bb4e5 [InstCombine] add tests for bitcasted selects; NFC
llvm-svn: 286978
2016-11-15 16:01:16 +00:00
Sanjay Patel aaa06fa486 [InstCombine] add tests to show missing bitcast folds
llvm-svn: 286900
2016-11-14 22:44:06 +00:00
Craig Topper b4173a5a70 [InstCombine][AVX-512] Teach InstCombineCalls to handle the new unmasked AVX-512 variable shift intrinsics.
llvm-svn: 286755
2016-11-13 07:26:19 +00:00
Craig Topper 8b831cbb2a [InstCombine][AVX-512] Expand vector shift handling to work on the AVX-512 shift by immediate and shift by single value.
This does not include support for the AVX-512 variable shifts. That will be coming in a future patch.

llvm-svn: 286739
2016-11-13 01:51:55 +00:00
Sanjay Patel 8e68394376 [InstCombine] update test to use FileCheck; NFC
llvm-svn: 286668
2016-11-11 23:12:46 +00:00
Sanjay Patel da0149dd74 [InstCombine] add tests to show size-increasing select transforms
llvm-svn: 286619
2016-11-11 19:37:54 +00:00
Sanjay Patel 40d33e7554 [InstCombine] auto-generate better checks; NFC
Note that the existing metadata checking was re-added by hand because the 
script doesn't currently know how to generate checks for lines outside of 
functions.

llvm-svn: 286460
2016-11-10 14:58:17 +00:00
Sanjay Patel 4e1b5a53c7 [InstCombine] avoid infinite loop from shuffle-extract-insert sequence (PR30923)
Removing the limitation in visitInsertElementInst() causes several regressions
because we're not prepared to fold sequences of shuffles or inserts and extracts
separated by shuffles. Fixing that appears to be a difficult mission because we
are purposely trying to avoid creating shuffles with arbitrary shuffle masks
because some targets may choke on those.

https://llvm.org/bugs/show_bug.cgi?id=30923

llvm-svn: 286423
2016-11-10 00:15:14 +00:00
Sanjay Patel 600631daf3 [InstCombine] regenerate checks; NFC
llvm-svn: 286402
2016-11-09 22:21:58 +00:00
Sanjay Patel 16da6c466f [InstCombine] regenerate checks; NFC
llvm-svn: 286399
2016-11-09 21:41:34 +00:00
Sanjay Patel 4e9d6cd354 [InstCombine] fix profitability equation for max-of-nots transform
As the test change shows, we can increase the critical path by adding
a 'not' instruction, so make sure that we're actually removing an
instruction if we do this transform.

This transform could also cause us to miss folds of min/max pairs.

llvm-svn: 286315
2016-11-09 00:13:11 +00:00
Davide Italiano 1e77aaca8a [LibcallsShrinkWrap] This pass doesn't preserve the CFG.
For example, it invalidates the domtree, causing assertions
in later passes which need dominator infos. Make it preserve
GlobalsAA, as suggested by Eli.

Differential Revision:  https://reviews.llvm.org/D26381

llvm-svn: 286271
2016-11-08 19:18:20 +00:00
Sanjay Patel 8625c43662 [InstCombine] move min/max tests to min/max test file; NFC
llvm-svn: 286256
2016-11-08 18:12:19 +00:00
Sanjay Patel 686cf49f7a [InstCombine] update checks; NFC
llvm-svn: 286255
2016-11-08 18:06:14 +00:00
Sanjay Patel 86408a8048 [InstCombine] allow splat vector folds in adjustMinMax() (retry r285732)
This was reverted at r285866 because there was a crash handling a scalar
select of vectors. I added a check for that pattern and a test case based
on the example provided in the post-commit thread for r285732.

llvm-svn: 286113
2016-11-07 15:52:45 +00:00
Greg Bedwell 5fc6f94591 Revert "[InstCombine] allow splat vector folds in adjustMinMax()"
This reverts commit r285732.

This change introduced a new assertion failure in the following
testcase at -O2:

typedef short __v8hi __attribute__((__vector_size__(16)));
__v8hi foo(__v8hi &V1, __v8hi &V2, unsigned mask) {
  __v8hi Result = V1;
  if (mask & 0x80)
    Result[0] = V2[0];
  return Result;
}

llvm-svn: 285866
2016-11-02 23:17:05 +00:00
Sanjay Patel c3d89842ad [InstCombine] allow splat vector folds in adjustMinMax()
llvm-svn: 285732
2016-11-01 20:08:02 +00:00
Sanjay Patel c0339c77ef [InstCombine] Fold nuw left-shifts in `ugt`/`ule` comparisons.
This transforms

%a = shl nuw %x, c1
%b = icmp {ugt|ule} %a, c0

into

%b = icmp {ugt|ule} %x, (c0 >> c1)

z3:

(declare-const x (_ BitVec 64))
(declare-const c0 (_ BitVec 64))
(declare-const c1 (_ BitVec 64))

(push)
(assert (= x (bvlshr (bvshl x c1) c1)))  ; nuw
(assert (not (= (bvugt (bvshl x c1) c0)
                (bvugt x
                       (bvlshr c0 c1)))))
(check-sat)
(get-model)
(pop)

(push)
(assert (= x (bvlshr (bvshl x c1) c1)))  ; nuw
(assert (not (= (bvule (bvshl x c1) c0)
                (bvule x
                       (bvlshr c0 c1)))))
(check-sat)
(get-model)
(pop)

Patch by bryant!

Differential Revision: https://reviews.llvm.org/D25913

llvm-svn: 285729
2016-11-01 19:19:29 +00:00
Sanjay Patel 700193ae05 [InstCombine] add vector tests for ext+adjust min/max
llvm-svn: 285713
2016-11-01 17:34:29 +00:00
Sanjay Patel 586aeee341 [InstCombine] move/fix tests for adjusted min/max
I think the former 'test50' had a typo making it functionally equivalent
to the former 'test49'; changed the predicate to provide more coverage.

llvm-svn: 285706
2016-11-01 16:39:30 +00:00
Sanjay Patel 04ef115ff7 [InstCombine] fix tests for adjusted min/max
1. Delete identical tests
2. Rename tests to reflect actual functionality
3. Add comments
4. Add unsigned variants
5. Add vector variants with FIXME comments
6. Rename test file

llvm-svn: 285699
2016-11-01 15:48:30 +00:00
Simon Pilgrim 6dd8fab443 [InstCombine] Folding of shifts by the sum of positive values
This patch introduces the combine:

(C1 shift (A add C2)) -> ((C1 shift C2) shift A)
iff A and C2 are both positive

If both A and C2 are know to be positive then we can safely split into 2 shifts, permitting the folding of the Inner shift.

Fix for the spec benchmark case mentioned by @nadav on PR15141 (assuming we can prove that the inputs as positive).

Differential Revision: https://reviews.llvm.org/D26000

llvm-svn: 285696
2016-11-01 15:40:30 +00:00
Sanjay Patel 69587324e8 [InstCombine] auto-generate better checks
llvm-svn: 285693
2016-11-01 14:38:30 +00:00
Sanjay Patel 36eeb6d6f6 [ValueTracking] recognize more variants of smin/smax
Try harder to detect obfuscated min/max patterns: the initial pattern was added with D9352 / rL236202. 
There was a bug fix for PR27137 at rL264996, but I think we can do better by folding the corresponding
smax pattern and commuted variants.

The codegen tests demonstrate the effect of ValueTracking on the backend via SelectionDAGBuilder. We
can't expose these differences minimally in IR because we don't have smin/smax intrinsics for IR.

Differential Revision: https://reviews.llvm.org/D26091

llvm-svn: 285499
2016-10-29 16:21:19 +00:00
Sanjay Patel 978f827d12 [InstCombine] re-use bitcasted compare operands in selects (PR28001)
These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>.

The bitcasts obfuscate the expected min/max forms as shown in PR28001:
https://llvm.org/bugs/show_bug.cgi?id=28001#c6

Differential Revision: https://reviews.llvm.org/D25943

llvm-svn: 285495
2016-10-29 15:22:04 +00:00
Sanjay Patel 19ace1d548 [InstCombine] move/add tests for smin/smax folds
llvm-svn: 285414
2016-10-28 16:54:03 +00:00
Davide Italiano 30665147f9 [ConstantFold] Get the correct vector type when folding a getelementptr.
Differential Revision:  https://reviews.llvm.org/D26014

llvm-svn: 285371
2016-10-28 00:53:16 +00:00
Davide Italiano e4146714ca Remove accidentally commited test.
llvm-svn: 285366
2016-10-27 23:40:19 +00:00
Davide Italiano e865a525df [IR] Reintroduce getGEPReturnType(), it will be used in a later patch.
llvm-svn: 285365
2016-10-27 23:38:51 +00:00
Sanjay Patel c0de9c9e40 [InstCombine] fix foldSPFofSPF() to handle vector splats
llvm-svn: 285345
2016-10-27 21:19:40 +00:00
Sanjay Patel 923f74b27c [InstCombine] add vector tests for foldSPFofSPF to show missing folds
llvm-svn: 285340
2016-10-27 20:51:03 +00:00
Sanjay Patel cc8e0af3e5 [InstCombine] auto-generate checks for min/max tests
llvm-svn: 285336
2016-10-27 19:54:15 +00:00
Sanjay Patel 611f9f92fc [InstCombine] handle simple vector integer constants in IsFreeToInvert
llvm-svn: 285318
2016-10-27 17:30:50 +00:00
Sanjay Patel e372aecb8a [ValueTracking] fix matchSelectPattern to allow vector splat folds of min/max/abs/nabs
llvm-svn: 285303
2016-10-27 15:26:10 +00:00
Sanjay Patel d5b8d64d4b [InstCombine] add tests for missing folds of vector abs/nabs/min/max
llvm-svn: 285299
2016-10-27 15:02:45 +00:00
Sanjay Patel f21dd2648f [InstCombine] auto-generate better checks; NFC
llvm-svn: 285293
2016-10-27 13:55:37 +00:00
Sanjay Patel 0bacecfb32 [InstCombine] consolidate zext tests and auto-generate checks; NFC
llvm-svn: 285195
2016-10-26 14:08:49 +00:00
Sanjay Patel 0b756ff1d1 [InstCombine] auto-generate better checks; NFC
llvm-svn: 285194
2016-10-26 13:58:22 +00:00
Guozhi Wei ae541f6a71 [InstCombine] Resubmit the combine of A->B->A BitCast and fix for pr27996
The original patch of the A->B->A BitCast optimization was reverted by r274094 because it may cause infinite loop inside compiler https://llvm.org/bugs/show_bug.cgi?id=27996.

The problem is with following code

xB = load (type B); 
xA = load (type A); 
+yA = (A)xB; B -> A
+zAn = PHI[yA, xA]; PHI 
+zBn = (B)zAn; // A -> B
store zAn;
store zBn;

optimizeBitCastFromPhi generates

+zBn = (B)zAn; // A -> B

and expects it will be combined with the following store instruction to another

store zAn 

Unfortunately before combineStoreToValueType is called on the store instruction, optimizeBitCastFromPhi is called on the new BitCast again, and this pattern repeats indefinitely.

optimizeBitCastFromPhi only generates BitCast for load/store instructions, only the BitCast before store can cause the reexecution of optimizeBitCastFromPhi, and BitCast before store can easily be handled by InstCombineLoadStoreAlloca.cpp. So the solution to the problem is if all users of a CI are store instructions, we should not do optimizeBitCastFromPhi on it. Then optimizeBitCastFromPhi will not be called on the new BitCast instructions.

Differential Revision: https://reviews.llvm.org/D23896

llvm-svn: 285116
2016-10-25 20:43:42 +00:00
Sanjay Patel f3dda13bd2 [InstCombine] Ensure that truncated int types are legal.
Fixes the FIXMEs in D25952 and rL285075.

Patch by bryant!

Differential Revision: https://reviews.llvm.org/D25955

llvm-svn: 285108
2016-10-25 20:11:47 +00:00
Sanjay Patel 8f9235dd7f [InstCombine] add tests for missing icmp + shl nuw fold
Patch by bryant!

Differential Revision: https://reviews.llvm.org/D25952

llvm-svn: 285095
2016-10-25 18:47:56 +00:00
Sanjay Patel d59f7f9047 [InstCombine] add test and code comment to show potentially misguided icmp trunc transform
llvm-svn: 285075
2016-10-25 15:16:39 +00:00