Commit Graph

104 Commits

Craig Topper 68a272d501 [X86] Merge the 3 different flavors of masked vpermi2var/vpermt2var builtins to a single version without masking. Use select builtins with appropriate operand instead.
llvm-svn: 333387
2018-05-29 03:26:38 +00:00
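A minimal sketch of the pattern the commit above describes, assuming clang with -mavx512vl; the function name is illustrative and this is not the verbatim header code:

#include <immintrin.h>

static inline __m128i
mask_permutex2var_epi32_sketch(__m128i a, __mmask8 k, __m128i idx, __m128i b)
{
  /* One unmasked operation, then a per-element select against the passthrough a. */
  return (__m128i)__builtin_ia32_selectd_128(k,
                  (__v4si)_mm_permutex2var_epi32(a, idx, b),
                  (__v4si)a);
}

The maskz (zeroing) flavor follows the same shape with a zero vector as the passthrough.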
Craig Topper f2043b08b4 [X86] Remove mask argument from more builtins that are handled completely in CGBuiltin.cpp. Just wrap a select builtin around them in the header file instead.
llvm-svn: 333061
2018-05-23 04:51:54 +00:00
Craig Topper 842171de36 [X86] Use __builtin_convertvector to implement some of the packed integer to packed float conversion intrinsics.
I believe this is safe assuming the default FP environment. The conversion might be inexact, but it can never overflow the FP type so this shouldn't be undefined behavior for the uitofp/sitofp instructions.

We already do something similar for scalar conversions.

Differential Revision: https://reviews.llvm.org/D46863

llvm-svn: 332882
2018-05-21 20:19:17 +00:00
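A rough illustration of the __builtin_convertvector approach; the local typedefs and function name are for this sketch only (assuming clang with -mavx):

#include <immintrin.h>

typedef int   v8si_t __attribute__((__vector_size__(32)));
typedef float v8sf_t __attribute__((__vector_size__(32)));

static inline __m256 cvtepi32_ps_sketch(__m256i a) {
  /* Element-wise i32 -> float; lowers to a plain sitofp, no x86-specific builtin needed. */
  return (__m256)__builtin_convertvector((v8si_t)a, v8sf_t);
}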
Craig Topper 55b4067350 [X86] Remove mask arguments from permvar builtins/intrinsics. Use a select in IR instead.
Someday maybe we'll use selects for all the builtins.

llvm-svn: 332825
2018-05-20 23:34:10 +00:00
Craig Topper 9d146bbaf7 [X86] Revert part of r332266: Use __builtin_convertvector to replace some of the avx512 truncate builtins.
The masking doesn't work right in the backend for the ones that produce byte or word elements without avx512bw.

llvm-svn: 332322
2018-05-15 03:17:52 +00:00
Craig Topper 25de41cfbc [X86] Use __builtin_convertvector to replace some of the avx512 truncate builtins.
As long as the destination type is a 256 or 128 bit vector with the same number of elements we can use __builtin_convertvector to directly generate trunc IR instruction which will be handled natively by the backend.

Differential Revision: https://reviews.llvm.org/D46742

llvm-svn: 332266
2018-05-14 17:50:40 +00:00
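The same idea sketched for a truncating conversion, here 4 x i64 narrowed to 4 x i32 (local typedefs and an illustrative name):

#include <immintrin.h>

typedef long long v4di_t __attribute__((__vector_size__(32)));
typedef int       v4si_t __attribute__((__vector_size__(16)));

static inline __m128i cvtepi64_epi32_sketch(__m256i a) {
  /* Same element count, narrower elements: lowers to a plain trunc instruction. */
  return (__m128i)__builtin_convertvector((v4di_t)a, v4si_t);
}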
Craig Topper ce281a41b5 [X86] Remove '#ifdef __x86_64__' around mask_set1_epi64 intrinsics.
The unmasked versions already didn't have this restriction. I don't think gcc or icc limit these to 64-bit mode so we shouldn't either.

llvm-svn: 330681
2018-04-24 03:36:08 +00:00
Craig Topper 21f66a3f6b [X86] Remove some masked cvt builtins that can be replaced with legacy sse/avx builtins and a select.
llvm-svn: 326039
2018-02-24 18:55:13 +00:00
Craig Topper 5dc6ca8e5b [X86] Remove __builtin_ia32_permvarsf256_mask and __builtin_ia32_permvarsi256_mask and use the avx2 unmasked versions and a select instead.
llvm-svn: 326022
2018-02-24 06:46:42 +00:00
Uriel Korach 5b2b71d909 [X86] test/testn intrinsics lowering to IR, clang side
Changed the intrinsic header files to lower the test and testn intrinsics to IR code.
Removed the test and testn builtins from clang.

Differential Revision: https://reviews.llvm.org/D38737

llvm-svn: 318035
2017-11-13 12:50:52 +00:00
Jina Nahias dca979194d [x86][AVX512] Lowering shuffle i/f intrinsics to LLVM IR
This patch, together with a matching llvm patch (https://reviews.llvm.org/D38671), implements the lowering of X86 shuffle i/f intrinsics to IR.

Differential Revision: https://reviews.llvm.org/D38672

Change-Id: I9b3c2f2b34323bd9ccb21d0c1832f848b88ec047
llvm-svn: 318025
2017-11-13 09:15:31 +00:00
Craig Topper 57f96ac6dc [X86] Replace the mask cmpeq/cmple/cmplt/cmpgt/cmpge/cmpneq intrinsics with macros that just pass the right comparison predicate value to the regular cmp intrinsic. Remove mask cmpeq/cmpgt builtins that are now unused.
This shortens the intrinsic headers a little and allows us to get rid of the cmpeq and cmpgt handling from CGBuiltin.cpp.

llvm-svn: 317506
2017-11-06 21:00:49 +00:00
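The rough shape of those macros (the my_ names are placeholders; assuming clang with -mavx512vl so the general compare intrinsic and the _MM_CMPINT_* constants are available):

#include <immintrin.h>

/* Fixed-predicate compares become thin wrappers over the general compare intrinsic. */
#define my_cmpeq_epi32_mask(A, B)  _mm_cmp_epi32_mask((A), (B), _MM_CMPINT_EQ)
#define my_cmple_epi32_mask(A, B)  _mm_cmp_epi32_mask((A), (B), _MM_CMPINT_LE)
#define my_cmpneq_epi32_mask(A, B) _mm_cmp_epi32_mask((A), (B), _MM_CMPINT_NE)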
Jina Nahias 123c599a0f Fixing a bug in the mask[z]_set1 intrinsic
Differential Revision: https://reviews.llvm.org/D38231

Change-Id: I80bbff9cbe93e4be54d8a761ef9723edf3f57c57
llvm-svn: 314102
2017-09-25 13:38:08 +00:00
Jina Nahias 3ad702a1ed Lowering Mask Set1 intrinsics to LLVM IR
This patch, together with a matching llvm patch (https://reviews.llvm.org/D37669), implements the lowering of X86 mask set1 intrinsics to IR.

Differential Revision: https://reviews.llvm.org/D37668

llvm-svn: 313624
2017-09-19 11:00:27 +00:00
Craig Topper 367c86ddbe [AVX-512] Replace subvector broadcast builtins with shufflevectors and selects.
Verified that the backend codegens this equally well.

llvm-svn: 292329
2017-01-18 02:17:10 +00:00
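One way to picture the shufflevector form of a subvector broadcast, sketched for a 128-bit to 512-bit integer broadcast (illustrative name; assuming clang with -mavx512f):

#include <immintrin.h>

static inline __m512i broadcast_i32x4_sketch(__m128i a) {
  /* Repeat the four 32-bit lanes of the 128-bit source across all 512 bits. */
  return (__m512i)__builtin_shufflevector((__v4si)a, (__v4si)a,
                                          0, 1, 2, 3, 0, 1, 2, 3,
                                          0, 1, 2, 3, 0, 1, 2, 3);
}

A masked variant would wrap this result in a select, as in the other commits in this series.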
Craig Topper 5391c98341 [AVX-512] Remove 128/256-bit masked vpermilvar builtins and replace with select and the avx unmasked builtins.
llvm-svn: 289338
2016-12-10 20:27:39 +00:00
Craig Topper 6aefe00ccf [X86] Replace valignd/q builtins with appropriate __builtin_shufflevector.
llvm-svn: 287733
2016-11-23 01:47:12 +00:00
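For valignd/valignq the generic form is a shuffle over the concatenation of the two sources. A sketch with a fixed shift of four dword elements (the real macro derives the indices from the immediate; assuming clang with -mavx512f):

#include <immintrin.h>

static inline __m512i alignr_epi32_by4_sketch(__m512i a, __m512i b) {
  /* concat(a:b) shifted right by 4 dwords: lanes 4..15 of b, then lanes 0..3 of a. */
  return (__m512i)__builtin_shufflevector((__v16si)b, (__v16si)a,
                                          4, 5, 6, 7, 8, 9, 10, 11,
                                          12, 13, 14, 15, 16, 17, 18, 19);
}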
Simon Pilgrim 698528d83b [X86][AVX512] Replace lossless i32/u32 to f64 conversion intrinsics with generic IR
Both the (V)CVTDQ2PD (i32 to f64) and (V)CVTUDQ2PD (u32 to f64) conversion instructions are lossless and can be safely represented as generic __builtin_convertvector calls instead of x86 intrinsics without affecting final codegen.

This patch removes the clang builtins and their use in the headers - a future patch will deal with removing the llvm intrinsics.

This is an extension patch to D20528 which dealt with the equivalent sse/avx cases.

Differential Revision: https://reviews.llvm.org/D26686

llvm-svn: 287088
2016-11-16 09:27:40 +00:00
Craig Topper 5e0709d60b [AVX-512] Replace masked dword and qword variable shift builtins with unmasked builtins and a select.
This is part of a set of changes to allow InstCombine in the backend to optimize variable shifts without having to know about masking.

llvm-svn: 286757
2016-11-13 07:26:34 +00:00
Craig Topper 1a44193afd [AVX-512] Convert the rest of the masked shift by immediate and by single element builtins over to the newly added unmasked builtins and a select.
This should also fix PR30691 since the new builtins are handled like the legacy builtins in the backend.

llvm-svn: 286714
2016-11-12 07:16:59 +00:00
Craig Topper 08bf53ffda [AVX-512] Remove masked vector insert builtins and replace with native shufflevectors and selects.
Unfortunately, the backend currently doesn't fold masks into the instructions correctly when they come from these shufflevectors. I'll work on that in a future commit.

llvm-svn: 285667
2016-11-01 05:47:56 +00:00
Craig Topper 350729627a [AVX-512] Use selectd instead of selectps for _mm256_mask_extracti32x4_epi32.
llvm-svn: 285545
2016-10-31 05:49:11 +00:00
Craig Topper 93ffabd28d [AVX-512] Remove masked vector extract builtins and replace with native shufflevectors and selects.
Unfortunately, the backend currently doesn't fold masks into the instructions correctly when they come from these shufflevectors. I'll work on that in a future commit.

llvm-svn: 285540
2016-10-31 04:30:56 +00:00
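A sketch of the masked-extract pattern from the last two commits, with a hard-coded upper-half extract (illustrative name; assuming clang with -mavx512vl):

#include <immintrin.h>

static inline __m128i mask_extract_hi128_sketch(__m128i src, __mmask8 k, __m256i a) {
  /* A generic shuffle pulls out lanes 4..7, then a select applies the mask. */
  __m128i sub = (__m128i)__builtin_shufflevector((__v8si)a, (__v8si)a, 4, 5, 6, 7);
  return (__m128i)__builtin_ia32_selectd_128(k, (__v4si)sub, (__v4si)src);
}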
Craig Topper 66b2fd1209 [AVX-512] Remove many of the masked 128/256-bit shift builtins and replace them with unmasked builtins and selects.
llvm-svn: 285539
2016-10-31 04:30:51 +00:00
Craig Topper 2eadf1b67e [AVX-512] Remove masked 128/256-bit sqrt builtins and replace them with unmasked builtins and a select.
llvm-svn: 285504
2016-10-29 19:02:10 +00:00
Craig Topper 09e94007be [AVX-512] Remove masked 128/256-bit pmuludq/pmuldq builtins and replace them with unmasked builtins and a select.
llvm-svn: 285503
2016-10-29 19:02:07 +00:00
Craig Topper 160ca8420d [AVX-512] Remove masked 128/256-bit floating point max/min builtins. Use unmasked builtins with select instead.
llvm-svn: 285502
2016-10-29 19:02:03 +00:00
Craig Topper eee7c0520c [AVX-512] Replace masked 128/256-bit byte, word, and dword min/max builtins with selects and the older unmasked builtins.
llvm-svn: 284954
2016-10-23 23:57:30 +00:00
Craig Topper 11dda92405 [AVX-512] Replace masked 128/256-bit vpmovzx/vpmovsx builtins with native IR.
llvm-svn: 284927
2016-10-22 21:24:48 +00:00
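A sketch of what the native-IR form looks like for a zero extension (local typedefs and function name are for illustration only):

#include <immintrin.h>

typedef unsigned char u8x16_t __attribute__((__vector_size__(16)));
typedef unsigned char u8x4_t  __attribute__((__vector_size__(4)));
typedef int           i32x4_t __attribute__((__vector_size__(16)));

static inline __m128i cvtepu8_epi32_sketch(__m128i v) {
  /* Take the low four bytes, then widen each to i32; lowers to a zext in IR. */
  u8x4_t lo = __builtin_shufflevector((u8x16_t)v, (u8x16_t)v, 0, 1, 2, 3);
  return (__m128i)__builtin_convertvector(lo, i32x4_t);
}

The masked 128/256-bit forms then wrap this result in a select, as elsewhere in this series.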
Craig Topper 78a9c40326 [AVX-512] Remove builtins for 128/256-bit pabsb/pabsw. We can use a select and the older non-masked versions instead.
llvm-svn: 284924
2016-10-22 21:24:38 +00:00
Craig Topper 2dfab63bb3 [AVX-512] Remove 128-bit and 256-bit masked floating point add/sub/mul/div builtins and replace with native operations.
We can't do the 512-bit ones because they take a rounding mode argument that we can't represent.

llvm-svn: 280635
2016-09-04 18:30:17 +00:00
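A sketch of the masked add in this style, assuming clang with -mavx512vl (illustrative name, not the verbatim header code):

#include <immintrin.h>

static inline __m128 mask_add_ps_sketch(__m128 w, __mmask8 u, __m128 a, __m128 b) {
  __v4sf sum = (__v4sf)a + (__v4sf)b;  /* plain vector +, i.e. a native fadd in IR */
  return (__m128)__builtin_ia32_selectps_128(u, sum, (__v4sf)w);
}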
Craig Topper f43e4a1728 [AVX-512] Remove masked integer mullo builtins and replace with native IR.
llvm-svn: 280597
2016-09-03 19:19:49 +00:00
Craig Topper 0e18976b8d [AVX-512] Remove masked integer add/sub builtins and replace with native IR.
llvm-svn: 280596
2016-09-03 18:29:35 +00:00
Craig Topper 45db56c375 [X86] Add missing __x86_64__ qualifiers on a bunch of intrinsics that assume 64-bit GPRs are available.
Using these intrinsics in a 32-bit build results in assertions in the backend.

llvm-svn: 276249
2016-07-21 07:38:39 +00:00
Craig Topper 4d61a3c2d8 [AVX512] Replace masked AND/OR/XOR intrinsics with native code and remove the builtins.
llvm-svn: 275049
2016-07-11 06:14:18 +00:00
Simon Pilgrim f5a8837e1b [X86][AVX512] Converted the VBROADCAST intrinsics to generic IR
llvm-svn: 274544
2016-07-05 12:59:33 +00:00
Michael Zuckerman a72b49efe4 Intrinsic _mm256_permutexvar_epi64 doesn't accept three parameters, as specified below.
I deleted the extra mask parameter.

__m256i _mm256_permutexvar_epi64 (__m256i idx, __m256i a)
#include "immintrin.h"
Instruction: vpermq
CPUID Flags: AVX512VL + AVX512F
Description
Shuffle 64-bit integers in a across lanes using the corresponding index in idx, and store the results in dst.
Operation
FOR j := 0 to 3
  i := j*64
  id := idx[i+1:i]*64
  dst[i+63:i] := a[id+63:id]
ENDFOR
dst[MAX:256] := 0
      
(From: Intel intrinsics guide)        

llvm-svn: 274539
2016-07-05 11:30:31 +00:00
Craig Topper 2a383c9273 [X86] Use undefined instead of setzero in shufflevector based intrinsics when the second source is unused. Rewrite immediate extractions in shuffle intrinsics to be in ((c >> x) & y) form instead of ((c & z) >> x). This way only x varies between each use instead of having to vary x and z.
llvm-svn: 274525
2016-07-04 22:18:01 +00:00
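An illustrative macro in the ((c >> x) & y) style described above (placeholder name; assuming clang with -mavx so __v4df is available):

#include <immintrin.h>

/* Only the shift amount varies per lane; the mask 0x1 stays the same. */
#define my_shuffle_pd256(A, B, C)                             \
  (__m256d)__builtin_shufflevector((__v4df)(A), (__v4df)(B),  \
                                   0 + (((C) >> 0) & 0x1),    \
                                   4 + (((C) >> 1) & 0x1),    \
                                   2 + (((C) >> 2) & 0x1),    \
                                   6 + (((C) >> 3) & 0x1))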
Simon Pilgrim 427154db2a [X86][AVX512] Converted the VSHUFPD intrinsics to generic IR
llvm-svn: 274523
2016-07-04 21:30:47 +00:00
Simon Pilgrim 30db811526 [X86][AVX512] Converted the VPERMPD/VPERMQ intrinsics to generic IR
llvm-svn: 274502
2016-07-04 13:34:44 +00:00
Simon Pilgrim 275d721485 [X86][AVX512] Converted the MOVDDUP/MOVSLDUP/MOVSHDUP masked intrinsics to generic IR
llvm companion patch imminent

llvm-svn: 274442
2016-07-02 17:16:25 +00:00
Craig Topper b3a4477b13 [X86] Replace 128-bit and 256 masked vpermilps/vpermilpd builtins with native IR.
llvm-svn: 274425
2016-07-02 05:36:43 +00:00
Craig Topper 79f53ca0b5 [AVX512] Replace masked unpack builtins with shufflevector and selects.
llvm-svn: 273533
2016-06-23 06:36:42 +00:00
Craig Topper 08181f795f [AVX512] Fix _mm_setzero_di to not require avx512vl since it's used by avx512dqintrin.h. Also update the avx512dq test to not enable the avx512vl feature so we can ensure correct dependencies.
llvm-svn: 273388
2016-06-22 06:36:21 +00:00
Craig Topper 879b0978f4 [AVX512] Move the 128-bit and 256-bit lzcnt intrinsics to avx512vlcdintrin.h where they belong.
llvm-svn: 273249
2016-06-21 06:53:58 +00:00
Craig Topper fc07498e4a [AVX512] Masked pcmpeqd, pcmpeqq, pcmpgtd, and pcmpgtq don't require avx512bw, just avx512vl.
llvm-svn: 272532
2016-06-13 04:15:11 +00:00
Craig Topper 7cc9263ec2 [AVX512] Implement masked and 512-bit pshufd intrinsics directly with __builtin_shufflevector and __builtin_ia32_select.
llvm-svn: 272467
2016-06-11 12:50:19 +00:00
Igor Breger aadb876200 [AVX512] Emit select instruction instead of using x86-specific intrinsics.
This will allow us to remove the x86 intrinsics from the backend.

Differential Revision: http://reviews.llvm.org/D21060

llvm-svn: 272141
2016-06-08 13:59:20 +00:00
Craig Topper 406d5cdf7c [AVX512] Remove space in -1 constants. NFC
llvm-svn: 271777
2016-06-04 05:43:37 +00:00
Michael Zuckerman 9e7d0a98fa [Clang][AVX512][INTRINSICS] Adding round cvt intrinsics and fixing the regular cvtps_ph
Differential Revision: http://reviews.llvm.org/D20870

llvm-svn: 271498
2016-06-02 07:44:08 +00:00