llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	91bbe98757	[X86] Remove masking from dbpsadbw builtins, use select builtin instead. llvm-svn: 334385	2018-06-11 06:18:29 +00:00
Craig Topper	03de166ccd	[X86] Add builtins for pshufd, pshuflw, and pshufhw to enable target feature and immediate range checking. llvm-svn: 334265	2018-06-08 06:13:16 +00:00
Craig Topper	d3623155a2	[X86] Add back builtins for _mm_slli_si128/_mm_srli_si128 and similar intrinsics. We still lower them to native shuffle IR, but we do it in CGBuiltin.cpp now. This allows us to check the target feature and ensure the immediate fits in 8 bits. This also improves our -O0 codegen slightly because we're able to see the zeroinitializer in the shuffle. It looks like it got lost behind a store+load previously. llvm-svn: 334208	2018-06-07 17:28:03 +00:00
Craig Topper	95ed88a1a9	[X86] Avoid passing _mm_undefined* to builtin_shufflevector if we are able to pass the first input a second time. This is more consistent with other usages of builtin_shufflevector. Later optimization passes or codegen will detect the duplicate vector and replace it with undef. Using _mm_undefined just puts a zeroinitializer that still needs to be optimized out later. llvm-svn: 333944	2018-06-04 19:28:09 +00:00
Craig Topper	dff5b311af	[X86] Reduce the number of setzero intrinsics to just the set defined by the Intel Intrinsics Guide. We had quite a few for different element sizes of integers sometimes with strange target features attached to them. We only need a single version for each of _m128i, _m256i, and _m512i with the target feature that first introduced those types. llvm-svn: 333568	2018-05-30 18:02:11 +00:00
Craig Topper	68a272d501	[X86] Merge the 3 different flavors of masked vpermi2var/vpermt2var builtins to a single version without masking. Use select builtins with appropriate operand instead. llvm-svn: 333387	2018-05-29 03:26:38 +00:00
Craig Topper	55b4067350	[X86] Remove mask arguments from permvar builtins/intrinsics. Use a select in IR instead. Someday maybe we'll use selects for all the builtins. llvm-svn: 332825	2018-05-20 23:34:10 +00:00
Craig Topper	25de41cfbc	[X86] Use __builtin_convertvector to replace some of the avx512 truncate builtins. As long as the destination type is a 256 or 128 bit vector with the same number of elements we can use __builtin_convertvector to directly generate trunc IR instruction which will be handled natively by the backend. Differential Revision: https://reviews.llvm.org/D46742 llvm-svn: 332266	2018-05-14 17:50:40 +00:00
Chandler Carruth	16429acacb	[x86] Revert r330322 (& r330323): Lowering x86 adds/addus/subs/subus intrinsics The LLVM commit introduces a crash in LLVM's instruction selection. I filed http://llvm.org/PR37260 with the test case. llvm-svn: 330997	2018-04-26 21:46:01 +00:00
Alexander Ivchenko	d96ddccdb4	Lowering x86 adds/addus/subs/subus intrinsics (clang) This is the patch that lowers x86 intrinsics to native IR in order to enable optimizations. Patch by tkrupa Differential Revision: https://reviews.llvm.org/D44786 llvm-svn: 330323	2018-04-19 12:15:11 +00:00
Craig Topper	2575454fe9	[X86] Replace 512-bit masked pmaddubsw and pmaddwd intrinsic with unmasked intrinsic and a select. This makes it consistent with the 128/256-bit functions. Someday maybe we'll have all the masking moved to selects. llvm-svn: 329775	2018-04-11 04:55:10 +00:00
Craig Topper	0a70c3c7af	[X86] Remove mask from 512 bit pmulhrsw/pmulhw/pmulhuw builtins. We now use a vselect node in IR around an unmasked builtin. This makes it consistent with the 128 and 256 bit versions. llvm-svn: 325560	2018-02-20 07:28:18 +00:00
Craig Topper	ebb0838f74	[X86] Reverse the operand order of the implementation of the kunpack builtins. The second operand needs to be in the lower bits of the concatenation. This matches llvm 5.0, gcc, and icc behavior. Fixes PR36360. llvm-svn: 324954	2018-02-12 22:38:52 +00:00
Craig Topper	f517f1a516	[X86] Implement old kunpck intrinsics using vector ops on vXi1 instead of integer shift/and/or Summary: kunpck intrinsics were removed in favor of native IR a few months ago. The implementation lowers them as by operation on the integer types passed to the intrinsic and then just shifting, masking, and oring them together. A special X86 DAG combine was added to recognize this patter and turn it into a concat_vector operation. I think it makes more sense to keep the IR implementation closer to vector operations on vXi1. Given that we expect these builtins to be used around other builtins that operate on k-registers which we try to represent in IR with vXi1. InstCombine should be able to get rid of the bitcasts between integers and vXi1 leaving only the vector operations. Reviewers: RKSimon, spatel, zvi, jina.nahias Reviewed By: RKSimon Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D42016 llvm-svn: 322461	2018-01-14 19:23:50 +00:00
Craig Topper	de91dff5d4	[X86] Replace cvt*2mask intrinsics with native IR using 'icmp slt X, zeroinitializer. llvm-svn: 322038	2018-01-08 22:37:56 +00:00
Jina Nahias	eb0829155f	[x86][AVX512] Lowering kunpack intrinsics to LLVM IR This patch, together with a matching llvm patch (https://reviews.llvm.org/D39720), implements the lowering of X86 kunpack intrinsics to IR. Differential Revision: https://reviews.llvm.org/D39719 Change-Id: Id5d3cb394ad33b98be79a6783d1d15569e2b798d llvm-svn: 319777	2017-12-05 15:42:47 +00:00
Uriel Korach	5b2b71d909	[X86] test/testn intrinsics lowering to IR. clang side Change Header files of the intrinsics for lowering test and testn intrinsics to IR code. Removed test and testn builtins from clang Differential Revision: https://reviews.llvm.org/D38737 llvm-svn: 318035	2017-11-13 12:50:52 +00:00
Jina Nahias	3ad702a1ed	Lowering Mask Set1 intrinsics to LLVM IR This patch, together with a matching llvm patch (https://reviews.llvm.org/D37669), implements the lowering of X86 mask set1 intrinsics to IR. Differential Revision: https://reviews.llvm.org/D37668 llvm-svn: 313624	2017-09-19 11:00:27 +00:00
Uriel Korach	3fba3c3b0c	[X86] [PATCH] [intrinsics] Lowering X86 ABS intrinsics to IR. (clang) This patch, together with a matching llvm patch (https://reviews.llvm.org/D37693), implements the lowering of X86 ABS intrinsics to IR. Differential Revision: https://reviews.llvm.org/D37694 llvm-svn: 313133	2017-09-13 09:02:02 +00:00
Yael Tsafrir	23e7733230	[X86] Lower _mm[256\|512]_[mask[z]]_avg_epu[8\|16] intrinsics to native llvm IR Differential Revision: https://reviews.llvm.org/D37562 llvm-svn: 313011	2017-09-12 07:46:32 +00:00
Michael Zuckerman	13bcf4944a	Fix problem with test. llvm-svn: 299442	2017-04-04 15:44:06 +00:00
Michael Zuckerman	755a13db3d	[X86][Clang] Converting __mm{\|256\|512}_movm_epi{8\|16\|32\|64} LLVMIR call into generic intrinsics. This patch is a part two of two reviews, one for the clang and the other for LLVM. In this patch, I covered the clang side, by introducing the intrinsic to the front end. This is done by creating a generic replacement. Differential Revision: https://reviews.llvm.org/D31394a llvm-svn: 299431	2017-04-04 13:29:53 +00:00
Craig Topper	208c80556c	[AVX-512] Fix test cases that were using the builtins directly without typecasts instead of the intrinsic header. llvm-svn: 298041	2017-03-17 05:59:22 +00:00
Sanjay Patel	e795daa55e	[x86] these aren't the undefs you're looking for (PR32176) x86 has undef SSE/AVX intrinsics that should represent a bogus register operand. This is not the same as LLVM's undef value which can take on multiple bit patterns. There are better solutions / follow-ups to this discussed here: https://bugs.llvm.org/show_bug.cgi?id=32176 ...but this should prevent miscompiles with a one-line code change. Differential Revision: https://reviews.llvm.org/D30834 llvm-svn: 297588	2017-03-12 19:15:10 +00:00
Craig Topper	f0d1147fae	[AVX-512] Replace 512-bit masked packss/packus builtins and replace with new unmasked builtins. These new unmasked builtins will enable us to easily support optimizing these builtins in InstCombine in the backend. llvm-svn: 295291	2017-02-16 06:32:07 +00:00
Craig Topper	cdd3603c04	[AVX-512] Remove masking from 512-bit pshufb builtin. The backend now has a version without masking so wrap it with select. This will allow the backend to constant fold these to generic shuffle vectors like 128-bit and 256-bit without having to working about handling masking. llvm-svn: 289345	2016-12-10 23:09:52 +00:00
Craig Topper	37bf5c6a3f	[AVX-512] Replace masked 16-bit element variable shift builtins with new unmasked versions and selects. llvm-svn: 287313	2016-11-18 05:04:51 +00:00
Craig Topper	1a44193afd	[AVX-512] Convert the rest of the masked shift by immediate and by single element builtins over to the newly added unmasked builtins and a select. This should also fix PR30691 since the new builtins are handled like the legacy builtins in the backend. llvm-svn: 286714	2016-11-12 07:16:59 +00:00
Craig Topper	531ce28311	[AVX-512] Replace 64-bit element and 512-bit vector pmin/pmax builtins with native IR like we do for 128/256-bit, but with the addition of masking. llvm-svn: 284956	2016-10-24 04:04:24 +00:00
Craig Topper	0c5da26572	[AVX-512] Replace 512-bit pmovzx/sx builtins with native IR. llvm-svn: 284936	2016-10-23 07:35:47 +00:00
Elad Cohen	b107a22afb	[X86] Remove the mm_malloc.h include guard hack from the X86 builtins tests The X86 clang/test/CodeGen/*builtins.c tests define the mm_malloc.h include guard as a hack for avoiding its inclusion (mm_malloc.h requires a hosted environment since it expects stdlib.h to be available - which is not the case in these internal clang codegen tests). This patch removes this hack and instead passes -ffreestanding to clang cc1. Differential Revision: https://reviews.llvm.org/D24825 llvm-svn: 282581	2016-09-28 11:59:09 +00:00
Craig Topper	f43e4a1728	[AVX-512] Remove masked integer mullo builtins and replace with native IR. llvm-svn: 280597	2016-09-03 19:19:49 +00:00
Craig Topper	0e18976b8d	[AVX-512] Remove masked integer add/sub builtins and replace with native IR. llvm-svn: 280596	2016-09-03 18:29:35 +00:00
Eric Christopher	abb2b54ad3	After PR28761 use -Wall with -Werror in builtins tests to identify possible problems in headers. llvm-svn: 277696	2016-08-04 06:02:50 +00:00
Simon Pilgrim	f5a8837e1b	[X86][AVX512] Converted the VBROADCAST intrinsics to generic IR llvm-svn: 274544	2016-07-05 12:59:33 +00:00
Michael Zuckerman	7dac6fbdf8	[Clang][BuiltIn][AVX512] adding _mm{\|256\|512}_mask_cvt{s\|us\|}epi16_storeu_epi8 intrinsics Differential Revision: http://reviews.llvm.org/D21729 llvm-svn: 274532	2016-07-05 08:08:01 +00:00
Artur Pilipenko	70d4bb566c	Update the expected masked load/store intrinsics names in tests The mangling of their names was changed in order to support arbitrary addrspace pointers as arguments in rL274043. llvm-svn: 274044	2016-06-28 18:28:45 +00:00
Craig Topper	79f53ca0b5	[AVX512] Replace masked unpack builtins with shufflevector and selects. llvm-svn: 273533	2016-06-23 06:36:42 +00:00
Craig Topper	d1691c7026	[AVX512] Replace masked integer cmp and ucmp builtins with native IR. llvm-svn: 273378	2016-06-22 04:47:58 +00:00
Craig Topper	a54c21e742	[AVX512] Use native IR for mask pcmpeq/pcmpgt intrinsics. llvm-svn: 272787	2016-06-15 14:06:34 +00:00
Craig Topper	68738332b8	[AVX512] Implement 512-bit and masked shufflelo and shufflehi intrinsics directly with __builtin_shufflevector and __builtin_ia32_select. Also improve the formatting of the AVX2 version. llvm-svn: 272452	2016-06-11 03:31:13 +00:00
Craig Topper	d4273a425e	[AVX512] Add _mm512_bsrli_epi128 and _mm512_bslli_epi128 intrinsics. llvm-svn: 272451	2016-06-11 03:31:07 +00:00
Igor Breger	aadb876200	[AVX512] Emit select instruction instead of using x86 specific instrinsics. This will allow us to remove the x86 instrinics from the backend. Differential Revision: http://reviews.llvm.org/D21060 llvm-svn: 272141	2016-06-08 13:59:20 +00:00
Craig Topper	f51cc07719	[AVX512] Convert masked palignr builtins directly to native IR similar to the other palignr builtins, but with a select to handle masking. llvm-svn: 271873	2016-06-06 06:13:01 +00:00
Craig Topper	4b060e31c9	[AVX512] Convert masked load builtins to generic masked load intrinsics instead of the x86 specific ones. This will allow the x86 intrinsics to be removed from the backend. llvm-svn: 271253	2016-05-31 06:58:07 +00:00
Craig Topper	6e891fbdd2	[AVX512] Emit generic masked store instrinsics instead of using x86 specific intrinsics. This will allow us to remove the x86 instrinics from the backend. llvm-svn: 271246	2016-05-31 01:50:10 +00:00
Michael Zuckerman	e871785eb6	[Clang][avx512][Builtin] Adding intrinsics for cvtw2mask{128\|256\|512} instruction set Differential Revision: http://reviews.llvm.org/D19766 llvm-svn: 268385	2016-05-03 14:12:23 +00:00
Michael Zuckerman	de8d3753d3	[clang][AVX512][Builtin] Adding intrinsics for the SAD instruction set. Differential Revision: http://reviews.llvm.org/D19591 llvm-svn: 267942	2016-04-28 21:21:08 +00:00
Michael Zuckerman	533e065bdc	[Clang][BuiltIn][AVX512] Adding intrinsics fot align{d\|q} and palignr instruction set Differential Revision: http://reviews.llvm.org/D19588 llvm-svn: 267876	2016-04-28 12:47:30 +00:00
Michael Zuckerman	8938e836c4	[Clang][AVX512][BuiltIn] Adding support to intrinsics of VPERMD and VPERMW instruction set Differential Revision: http://reviews.llvm.org/D19195 llvm-svn: 267380	2016-04-25 05:32:35 +00:00

1 2

75 Commits