Commit Graph

220 Commits

Author SHA1 Message Date
Craig Topper 9256ac1a58 [X86] Add 512-bit unmasked pmulhrsw/pmulhw/pmulhuw intrinsics. Remove and auto upgrade 128/256/512 bit masked pmulhrsw/pmulhw/pmulhuw intrinsics.
The 128 and 256 bit versions were already not used by clang. This adds an equivalent unmasked 512 bit version. Then autoupgrades all sizes to use unmasked intrinsics plus select.

llvm-svn: 325559
2018-02-20 07:28:14 +00:00
Craig Topper 8d19c6fba2 [X86] Reverse the operand order of the autoupgrade of the kunpack builtins.
The second operand needs to be in the lower bits of the concatenation. This matches llvm 5.0, gcc, and icc behavior.

Fixes PR36360.

llvm-svn: 324953
2018-02-12 22:38:34 +00:00
Craig Topper 4dccffc84a [X86] Change signatures of avx512 packed fp compare intrinsics to return a vXi1 mask type to be closer to an fcmp.
Summary:
This patch changes the signature of the avx512 packed fp compare intrinsics to return a vXi1 vector and no longer take a mask as input. The casts to scalar type will now need to be explicit in the IR. The masking node will now be an explicit and in the IR.

This makes the intrinsic look much more similar to an fcmp instruction that we wish we could use for these but can't. We already use icmp instructions for integer compares.

Previously the lowering step of isel would turn the intrinsic into an X86 specific ISD node and a emit the masking nodes as well as some bitcasts. This means DAG combines can't see the vXi1 type until somewhat late, making it more difficult to combine out gpr<->mask transition sequences. By exposing the vXi1 type explicitly in the IR and initial SelectionDAG we give earlier DAG combines and even InstCombine the chance to see it and optimize it.

This should make any issues with gpr<->mask sequences the same between integer and fp. Meaning we only have to fix them once.

Reviewers: spatel, delena, RKSimon, zvi

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43137

llvm-svn: 324827
2018-02-10 23:33:55 +00:00
Craig Topper dccf72b583 [X86] Remove kortest intrinsics and replace with native IR.
llvm-svn: 324646
2018-02-08 20:16:06 +00:00
Craig Topper 071ad9c6e0 [X86] Remove and autoupgrade kand/kandn/kor/kxor/kxnor/knot intrinsics.
Clang already stopped using these a couple months ago.

The test cases aren't great as there is nothing forcing the operations to stay in k-registers so some of them moved back to scalar ops due to the bitcasts being moved around.

llvm-svn: 324177
2018-02-03 20:18:25 +00:00
Daniel Neilson 1e68724d24 Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)
Summary:
 This is a resurrection of work first proposed and discussed in Aug 2015:
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
and initially landed (but then backed out) in Nov 2015:
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

 The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.

 This change is the first in a series that allows source and dest to each
have their own alignments by using the alignment attribute on their arguments.

 In this change we:
1) Remove the alignment argument.
2) Add alignment attributes to the source & dest arguments. We, temporarily,
   require that the alignments for source & dest be equal.

 For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)

 Downstream users may have to update their lit tests that check for
@llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script
may help with updating the majority of your tests, but it does not catch all possible
patterns so some manual checking and updating will be required.

s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g

 The remaining changes in the series will:
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
   source and dest alignments.
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
        and those that use use MemIntrinsicInst::[get|set]Alignment() to use
        getDestAlignment() and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
        MemIntrinsicInst::[get|set]Alignment() methods.

Reviewers: pete, hfinkel, lhames, reames, bollu

Reviewed By: reames

Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits

Differential Revision: https://reviews.llvm.org/D41675

llvm-svn: 322965
2018-01-19 17:13:12 +00:00
Craig Topper 7197a452fc [X86] Autoupgrade kunpck intrinsics using vector operations instead of scalar operations
Summary: This patch changes the kunpck intrinsic autoupgrade to use vXi1 shufflevector operations to perform vector extracts and concats. This more closely matches the definition of the kunpck instructions. Currently we rely on a DAG combine to turn the scalar shift/and/or code into a concat vectors operation. By doing it in the IR we get this for free.

Reviewers: spatel, RKSimon, zvi, jina.nahias

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42018

llvm-svn: 322462
2018-01-14 19:24:10 +00:00
Craig Topper cc342d465e [X86] Remove llvm.x86.avx512.cvt*2mask.* intrinsics and autoupgrade to (icmp slt X, 0)
I had to drop fast-isel-abort from a test because we can't fast isel some of the mask stuff. When we used intrinsics we implicitly fell back to SelectionDAG for the intrinsic call without triggering the abort error. But with native IR that doesn't happen the same way.

llvm-svn: 322050
2018-01-09 00:50:47 +00:00
Michael Zolotukhin f05cb4374d Remove redundant includes from lib/IR.
llvm-svn: 320622
2017-12-13 21:30:52 +00:00
Craig Topper c3c3ebf29d [X86] Attempt to fix a ubsan failure in the autoupgrade of kunpck intrinsics.
llvm-svn: 319911
2017-12-06 17:54:07 +00:00
Jina Nahias 51c1a627c2 [x86][AVX512] Lowering kunpack intrinsics to LLVM IR
This patch, together with a matching clang patch (https://reviews.llvm.org/D39719), implements the lowering of X86 kunpack intrinsics to IR.

Differential Revision: https://reviews.llvm.org/D39720

Change-Id: I4088d9428478f9457f6afddc90bd3d66b3daf0a1
llvm-svn: 319778
2017-12-05 15:42:56 +00:00
Uriel Korach 2aa707bdaa [X86] test/testn intrinsics lowering to IR. llvm part.
Remove builtins from llvm and add AutoUpgrade support.
Also add fast-isel tests for the TEST and TESTN instructions.

Differential Revision: https://reviews.llvm.org/D38736

llvm-svn: 318036
2017-11-13 12:51:18 +00:00
Jina Nahias 9a7f9f123c [x86][AVX512] Lowering shuffle i/f intrinsics to LLVM IR
This patch, together with a matching clang patch (https://reviews.llvm.org/D38672), implements the lowering of X86 shuffle i/f intrinsics to IR.

Differential Revision: https://reviews.llvm.org/D38671

Change-Id: I1e7d359a74743e995ec356237a85214ce55d3661
llvm-svn: 318026
2017-11-13 09:16:39 +00:00
Jina Nahias 7b705f1f91 [x86][AVX512] Lowering Broadcastm intrinsics to LLVM IR
This patch, together with a matching clang patch (https://reviews.llvm.org/D38683), implements the lowering of X86 broadcastm intrinsics to IR.

Differential Revision: https://reviews.llvm.org/D38684

Change-Id: I709ac0b34641095397e994c8ff7e15d1315b3540
llvm-svn: 317458
2017-11-06 07:09:24 +00:00
Saleem Abdulrasool 46a59fdab6 Bitcode: add an auto-upgrade for LTO section name
The bitcode reader looks specifically for `__DATA, __objc_catlist` as a
section name.  However, SVN r304661 removed the spaces (the two names
are functionally equivalent but do not compare equally
lexicographically).  This causes compatibility issues.  Add an
auto-upgrade path for removing the spaces as well as use the new name in
the LTO plugin.

llvm-svn: 315086
2017-10-06 18:06:59 +00:00
Adrian Prantl a8b2ddbde4 Move the stripping of invalid debug info from the Verifier to AutoUpgrade.
This came out of a recent discussion on llvm-dev
(https://reviews.llvm.org/D38042). Currently the Verifier will strip
the debug info metadata from a module if it finds the dbeug info to be
malformed. This feature is very valuable since it allows us to improve
the Verifier by making it stricter without breaking bcompatibility,
but arguable the Verifier pass should not be modifying the IR. This
patch moves the stripping of broken debug info into AutoUpgrade
(UpgradeDebugInfo to be precise), which is a much better location for
this since the stripping of malformed (i.e., produced by older, buggy
versions of Clang) is a (harsh) form of AutoUpgrade.

This change is mostly NFC in nature, the one big difference is the
behavior when LLVM module passes are introducing malformed debug
info. Prior to this patch, a NoAsserts build would have printed a
warning and stripped the debug info, after this patch the Verifier
will report a fatal error. I believe this behavior is actually more
desirable anyway.

Differential Revision: https://reviews.llvm.org/D38184

llvm-svn: 314699
2017-10-02 18:31:29 +00:00
Uriel Korach 0ecc984b1b [X86] Finishing broadcastf32x2 and broadcasti32x2 intrinsics lowering to IR. llvm side.
Removing X86 broadcast(f/i)32x2 intrinsics from llvm.
Adding autoUpgrade support.
Moving matching tests from avx512dq-intrinsics.ll to avx512dq-intrinsics-upgrade.ll and from avx512dqvl-intrinsics.ll to avx512dqvl-intrinsics-upgrade.ll.

Differential Revision: https://reviews.llvm.org/D38220

llvm-svn: 314195
2017-09-26 07:39:39 +00:00
Jina Nahias ccfb8d4fe8 [x86] Lowering Mask Set1 intrinsics to LLVM IR
This patch, together with a matching clang patch (https://reviews.llvm.org/D37668), implements the lowering of X86 mask set1 intrinsics to IR.

Differential Revision: https://reviews.llvm.org/D37669

llvm-svn: 313625
2017-09-19 11:03:06 +00:00
Craig Topper f264fcc704 [X86] Remove VPERM2F128/VPERM2I128 intrinsics and autoupgrade to native shuffles.
I've moved the test cases from the InstCombine optimizations to the backend to keep the coverage we had there. It covered every possible immediate so I've preserved the resulting shuffle mask for each of those immediates.

llvm-svn: 313450
2017-09-16 07:36:14 +00:00
Steven Wu ab211df5de [AutoUpgrade] Fix a compatibility issue with module flag
Summary:
After r304661, module flag to record objective-c image info section is
encoded without whitespaces after comma. The new name is equivalent to
the old one, except that when LTO a module built by old compiler and a
module built by a new compiler, it will fail with conflicting values.

Fix the issue by removing whitespaces in bitcode upgrade path.

rdar://problem/34416934

Reviewers: compnerd

Reviewed By: compnerd

Subscribers: mehdi_amini, hans, llvm-commits

Differential Revision: https://reviews.llvm.org/D37909

llvm-svn: 313398
2017-09-15 21:12:14 +00:00
Uriel Korach 5d5da5f531 [X86] [PATCH] [intrinsics] Lowering X86 ABS intrinsics to IR. (llvm)
This patch, together with a matching clang patch (https://reviews.llvm.org/D37694), implements the lowering of X86 ABS intrinsics to IR.

differential revision: https://reviews.llvm.org/D37693.

llvm-svn: 313134
2017-09-13 09:02:36 +00:00
Yael Tsafrir 47668b5e03 [X86] Lower _mm[256|512]_[mask[z]]_avg_epu[8|16] intrinsics to native llvm IR
Differential Revision: https://reviews.llvm.org/D37560

llvm-svn: 313013
2017-09-12 07:50:35 +00:00
Uriel Korach 01dfd3d1e3 Revert "adding autoUpgrade support to broadcast[f|i]32x2 intrinsics"
This reverts commit r312879 - An accidental partial commit.

llvm-svn: 312880
2017-09-10 09:07:21 +00:00
Uriel Korach 3eb10a79e5 adding autoUpgrade support to broadcast[f|i]32x2 intrinsics
llvm-svn: 312879
2017-09-10 08:40:13 +00:00
Steven Wu 010fc49e42 [IR] AutoUpgrade ModuleFlagBehavior for PIC and PIE level
Summary:
From r303590, ModuleFlagBehavior for PIC and PIE level is changed from
Error to Max. This will cause bitcode compatibility issue when linking
against a bitcode static archive built with old compiler.
Add an auto-ugprade path to upgrade the the ModuleFlagBehavior in the
old bitcode to match the new one so IRLinker can link them.

Reviewers: tejohnson, mehdi_amini, dexonsmith

Reviewed By: dexonsmith

Subscribers: hans, llvm-commits

Differential Revision: https://reviews.llvm.org/D36556

llvm-svn: 311387
2017-08-21 21:49:13 +00:00
Craig Topper 561092f233 [AVX512] Remove and autoupgrade many of the broadcast intrinsics
Summary:
This autoupgrades most of the broadcast intrinsics. They've been unused in clang for some time.

This leaves the 32x2 intrinsics because they are still used in clang.

Reviewers: RKSimon, zvi, igorb

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36606

llvm-svn: 310725
2017-08-11 16:22:45 +00:00
Adrian Prantl abe04759a6 Remove the obsolete offset parameter from @llvm.dbg.value
There is no situation where this rarely-used argument cannot be
substituted with a DIExpression and removing it allows us to simplify
the DWARF backend. Note that this patch does not yet remove any of
the newly dead code.

rdar://problem/33580047
Differential Revision: https://reviews.llvm.org/D35951

llvm-svn: 309426
2017-07-28 20:21:02 +00:00
Craig Topper 792fc92be2 [AVX-512] Remove and autoupgrade the masked integer compare intrinsics
Summary:
These intrinsics aren't used by clang and haven't been for a while.

There's some really terrible codegen in the 32-bit target for avx512bw due to i64 not being legal. But as I said these intrinsics aren't used by clang even before this patch so this codegen reflects our clang behavior today.

Reviewers: spatel, RKSimon, zvi, igorb

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34389

llvm-svn: 306047
2017-06-22 20:11:01 +00:00
Galina Kistanova f525c76ba1 Added missing break.
llvm-svn: 303454
2017-05-19 20:31:51 +00:00
Elad Cohen ef5798acf5 Support arbitrary address space pointers in masked gather/scatter intrinsics.
Fixes PR31789 - When loop-vectorize tries to use these intrinsics for a
non-default address space pointer we fail with a "Calling a function with a
bad singature!" assertion. This patch solves this by adding the 'vector of
pointers' argument as an overloaded type which will determine the address
space.

Differential revision: https://reviews.llvm.org/D31490

llvm-svn: 302018
2017-05-03 12:28:54 +00:00
Simon Pilgrim 5a22eaa2bf [X86][SSE] Update MOVNTDQA non-temporal loads to generic implementation (LLVM)
MOVNTDQA non-temporal aligned vector loads can be correctly represented using generic builtin loads, allowing us to remove the existing x86 intrinsics.

Clang companion patch: D31766.

Differential Revision: https://reviews.llvm.org/D31767

llvm-svn: 300325
2017-04-14 15:05:35 +00:00
Matt Arsenault f10061ec70 Add address space mangling to lifetime intrinsics
In preparation for allowing allocas to have non-0 addrspace.

llvm-svn: 299876
2017-04-10 20:18:21 +00:00
Michael Zuckerman 88fb171015 [X86][LLVM] Converting __mm{|256|512}_movm_epi{8|16|32|64} LLVMIR call into generic intrinsics.
This patch is a part one of two reviews, one for the clang and the other for LLVM. 
The patch deletes the back-end intrinsics and adds support for them in the auto upgrade.

Differential Revision: https://reviews.llvm.org/D31393

llvm-svn: 299432
2017-04-04 13:32:14 +00:00
George Burgess IV 56c7e88c2c Let llvm.objectsize be conservative with null pointers
This adds a parameter to @llvm.objectsize that makes it return
conservative values if it's given null.

This fixes PR23277.

Differential Revision: https://reviews.llvm.org/D28494

llvm-svn: 298430
2017-03-21 20:08:59 +00:00
Daniel Berlin 3f91004ce7 Keep attributes, calling convention, etc, when remangling intrinsic
Summary: Fix issue reported where intrinsic calling convention is dropped after r295253.

Reviewers: sanjoy

Subscribers: materi, llvm-commits

Differential Revision: https://reviews.llvm.org/D30422

llvm-svn: 296563
2017-03-01 01:49:13 +00:00
Craig Topper c43f3f3291 [IR][X86] Fix llvm version number in comments in AutoUpgrade. Forgot the next release is 5.0 not 4.1
llvm-svn: 296092
2017-02-24 05:35:07 +00:00
Craig Topper f2529c188b [AVX-512] Remove lzcnt intrinsics and autoupgrade them to generic ctlz intrinsics with select.
Clang has been emitting cltz intrinsics for a while now.

llvm-svn: 296091
2017-02-24 05:35:04 +00:00
Craig Topper 185ced8b2b [X86][IR] In AutoUpgrade, check explicitly for xop.vpcmov and xop.vpcmov.256 instead of anything starting with xop.vpcmov
There were some older intrinsics that only existed for less than a month in 2012 that still exist in some out of tree test files that start with this string, but aren't able to be handled by the current upgrade code and fire an assert. Now we'll go back to treating them as not intrinsics at all and just passing them through to output.

Fixes PR32041, sort of.

llvm-svn: 295930
2017-02-23 03:22:14 +00:00
Craig Topper de10312bea Recommit "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR."
Clang has now been fixed to not use these intrinsics.

llvm-svn: 295571
2017-02-18 21:50:58 +00:00
Craig Topper ba2a726cc6 Revert "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR."
This reverts r295564. I missed that clang was still using the intrinsics despite our half implemented autoupgrade support.

llvm-svn: 295565
2017-02-18 20:14:20 +00:00
Craig Topper 884db3f85d [X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR.
It seems we were already upgrading 128-bit VPCMOV, but the intrinsic was still defined and being used in isel patterns. While I was here I also simplified the tablegen multiclasses.

llvm-svn: 295564
2017-02-18 19:51:25 +00:00
Craig Topper 03a9adc2ba [X86][IR] Simplify the XOP vpcmov autoupgrade code. NFC
llvm-svn: 295563
2017-02-18 19:51:19 +00:00
Craig Topper aa49f14496 [X86][IR] Merge together some very similar AutoUpgrade handling. NFC
llvm-svn: 295562
2017-02-18 19:51:14 +00:00
Craig Topper a505169ca5 [AVX-512] Remove 128/256-bit masked fp max/min intrinsics. Upgrade them to legacy unmasked intrinsics and select instructions.
llvm-svn: 295543
2017-02-18 07:07:50 +00:00
Craig Topper cbd1b60e42 [IR][X86] Simplify some AutoUpgrade code slightly. NFC
llvm-svn: 295426
2017-02-17 07:07:24 +00:00
Craig Topper 905cc75f97 [IR][X86] Rename an AutoUpgrade helper function to more accurately match what intrinsics it handles. NFC
llvm-svn: 295425
2017-02-17 07:07:21 +00:00
Craig Topper b9b9cb0ce6 [IR][X86] Move X86 specific portions of UpgradeIntrinsicFunction1 to a couple helper functions. NFC
This enables some early outs to avoid repeatedly using IsX86 check to qualify. I hope to continue to improve this to shorten the lengths of some of the string comparisons.

llvm-svn: 295424
2017-02-17 07:07:19 +00:00
Craig Topper 715873ead3 [AVX-512] Remove masked packss/packus intrinsics and autoupgrade to unmasked intrinsics with select instructions. For 512-bit add new unmasked intrinsics.
The new 512-bit unmasked intrinsics will make it easy to handle these with the SSE/AVX intrinsics in InstCombine where we currently have a TODO.

llvm-svn: 295290
2017-02-16 06:31:54 +00:00
Daniel Berlin 3c1432fecf Implement intrinsic mangling for literal struct types.
Fixes PR 31921

Summary:
Predicateinfo requires an ugly workaround to try to avoid literal
struct types due to the intrinsic mangling not being implemented.
This workaround actually does not work in all cases (you can hit the
assert by bootstrapping with -print-predicateinfo), and can't be made
to work without DFS'ing the type (IE copying getMangledStr and using a
version that detects if it would crash).

Rather than do that, i just implemented the mangling.  It seems
simple, since they are unified structurally.

Looking at the overloaded-mangling testcase we have, it actually turns
out the gc intrinsics will *also* crash if you try to use a literal
struct.  Thus, the testcase added fails before this patch, and works
after, without needing to resort to predicateinfo.

Reviewers: chandlerc, davide

Subscribers: llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D29925

llvm-svn: 295253
2017-02-15 23:16:20 +00:00
Justin Lebar 46624a822d [NVPTX] Auto-upgrade some NVPTX intrinsics to LLVM target-generic code.
Summary:
Specifically, we upgrade llvm.nvvm.:

 * brev{32,64}
 * clz.{i,ll}
 * popc.{i,ll}
 * abs.{i,ll}
 * {min,max}.{i,ll,u,ull}
 * h2f

These either map directly to an existing LLVM target-generic
intrinsic or map to a simple LLVM target-generic idiom.

In all cases, we check that the code we generate is lowered to PTX as we
expect.

These builtins don't need to be backfilled in clang: They're not
accessible to user code from nvcc.

Reviewers: tra

Subscribers: majnemer, cfe-commits, llvm-commits, jholewinski

Differential Revision: https://reviews.llvm.org/D28793

llvm-svn: 292694
2017-01-21 01:00:32 +00:00