Commit Graph

48 Commits

Author SHA1 Message Date
Craig Topper a2137d6e09 [X86] Add -flax-vector-conversions=none to all of the x86 vector intrinsic header tests. 2020-01-23 20:43:50 -08:00
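For context, -flax-vector-conversions=none turns implicit conversions between distinct vector types into errors, so the headers and tests must spell every conversion explicitly. A minimal sketch of the kind of code the stricter mode rejects (the type names here are illustrative):

  typedef int       v4si __attribute__((__vector_size__(16)));
  typedef long long v2di __attribute__((__vector_size__(16)));

  v2di as_v2di(v4si x) {
    return x;            /* accepted under the default lax integer-vector
                            conversions, rejected with
                            -flax-vector-conversions=none; an explicit
                            (v2di) cast is needed there */
  }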
Richard Smith 9864269a0d Fix reliance on lax vector conversions in tests for x86 intrinsics.
llvm-svn: 372062
2019-09-17 03:56:28 +00:00
JF Bastien dbc0a5df8d Allow prefetching from non-zero address spaces
Summary:
This is useful for targets which have prefetch instructions for non-default address spaces.

<rdar://problem/42662136>

Subscribers: nemanjai, javed.absar, hiraditya, kbarton, jkorous, dexonsmith, cfe-commits, llvm-commits, RKSimon, hfinkel, t.p.northover, craig.topper, anemet

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D65254

llvm-svn: 367032
2019-07-25 16:11:57 +00:00
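A rough usage sketch of what this enables (address space 1 is an arbitrary example; what the number means is target-specific):

  void warm_line(const char __attribute__((address_space(1))) *p) {
    /* 0 = prefetch for read, 3 = high temporal locality */
    __builtin_prefetch(p, 0, 3);
  }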
Craig Topper caf6b71ab2 [X86] Change the IR sequence for _mm_storeh_pi and _mm_storel_pi to perform the store as a <2 x float> instead of i64.
This is similar to what we do for loadl_pi and loadh_pi.

llvm-svn: 365669
2019-07-10 17:11:29 +00:00
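For reference, these intrinsics store the low and high pair of floats from a __m128; the change only affects the IR sequence emitted, not the C-level behaviour. A small usage sketch:

  #include <xmmintrin.h>

  void split(__m128 v, __m64 *lo, __m64 *hi) {
    _mm_storel_pi(lo, v);   /* writes elements 0 and 1 */
    _mm_storeh_pi(hi, v);   /* writes elements 2 and 3 */
  }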
Craig Topper f9cb127ca9 [X86] Add guards to some of the x86 intrinsic tests to skip 64-bit mode only intrinsics when compiled for 32-bit mode.
All the command lines are for 64-bit mode, but sometimes I compile
the tests in 32-bit mode to see what assembly we get, and these
intrinsics need to be skipped for that to work.

llvm-svn: 365668
2019-07-10 17:11:23 +00:00
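The guards are ordinary preprocessor checks, roughly this shape (the intrinsic chosen here is just an example of a 64-bit-only one):

  #include <xmmintrin.h>

  #ifdef __x86_64__
  long long test_mm_cvtss_si64(__m128 A) {
    /* this intrinsic only exists when targeting x86-64 */
    return _mm_cvtss_si64(A);
  }
  #endif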
Reid Kleckner 79d7f4114d [X86] Use __m128_u for _mm_loadu_ps after r353555
Add secondary triple to existing SSE test for it.  I audited other uses
of __attribute__((__packed__)) in the intrinsic headers, and this seemed
to be the only missing one.

llvm-svn: 353878
2019-02-12 21:04:21 +00:00
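The idea behind __m128_u, sketched with local type names rather than the real header definitions: an align-1 vector typedef, wrapped in a packed, may_alias struct, lets the unaligned load be written as an ordinary dereference that clang lowers to an align-1 load.

  typedef float my_m128   __attribute__((__vector_size__(16)));
  typedef float my_m128_u __attribute__((__vector_size__(16), __aligned__(1)));

  static inline my_m128 my_loadu_ps(const float *p) {
    struct loader {
      my_m128_u v;
    } __attribute__((__packed__, __may_alias__));
    /* the member has alignment 1, so this becomes an unaligned vector load */
    return ((const struct loader *)p)->v;
  }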
Craig Topper 638426fc36 [X86] Add __builtin_ia32_selectss_128 and __builtin_ia32_selectsd_128 that are suitable for use in scalar mask intrinsics.
This will convert the i8 mask argument to <8 x i1> and extract an i1 and then emit a select instruction. This replaces the '(__U & 1)' and ternary operator used in some of the intrinsics. The old sequence was lowered to a scalar AND and a compare. The new sequence uses an i1 vector that will interoperate better with other mask intrinsics.

This removes the need to handle div_ss/sd specially in CGBuiltin.cpp. A follow up patch will add the GCCBuiltin name back in llvm and remove the custom handling.

I made some adjustments to the legacy move_ss/sd intrinsics, which are reused here, to do a simpler extract and insert instead of two extracts and two inserts or a shuffle.

llvm-svn: 336622
2018-07-10 00:37:25 +00:00
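At the header level, the new builtin lets a masked scalar operation be written roughly like this (a simplified sketch of the pattern, not the verbatim header code):

  #include <immintrin.h>

  __attribute__((target("avx512f")))
  static __m128 masked_add_ss(__m128 W, __mmask8 U, __m128 A, __m128 B) {
    A = _mm_add_ss(A, B);
    /* bit 0 of U selects between the new result and the passthrough W */
    return (__m128)__builtin_ia32_selectss_128(U, (__v4sf)A, (__v4sf)W);
  }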
Tomasz Krupa f1792bb3d6 [X86] Lowering sqrt intrinsics to native IR
Reviewers: craig.topper, spatel, RKSimon, igorb, uriel.k

Reviewed By: craig.topper

Subscribers: tkrupa, cfe-commits

Differential Revision: https://reviews.llvm.org/D41168

llvm-svn: 334850
2018-06-15 18:05:59 +00:00
Craig Topper c5ec55e921 [X86] Simplify the implementation of _mm_sqrt_ss, _mm_rcp_ss, and _mm_rsqrt_ss.
We don't need the insertion back into the original vector at the end. The builtin already understands that.

This is different than _mm_sqrt_sd which takes two arguments and we do need to insert.

llvm-svn: 333572
2018-05-30 18:27:07 +00:00
Gabor Buella 5219ed89be [X86] NFC Include immintrin.h in CodeGen tests
Following r333110:
"Move all Intel defined intrinsic includes into immintrin.h"

llvm-svn: 333160
2018-05-24 07:09:08 +00:00
Sanjay Patel e795daa55e [x86] these aren't the undefs you're looking for (PR32176)
x86 has undef SSE/AVX intrinsics that should represent a bogus register operand. 
This is not the same as LLVM's undef value which can take on multiple bit patterns.

There are better solutions / follow-ups to this discussed here:
https://bugs.llvm.org/show_bug.cgi?id=32176
...but this should prevent miscompiles with a one-line code change.

Differential Revision: https://reviews.llvm.org/D30834

llvm-svn: 297588
2017-03-12 19:15:10 +00:00
Elad Cohen b107a22afb [X86] Remove the mm_malloc.h include guard hack from the X86 builtins tests
The X86 clang/test/CodeGen/*builtins.c tests define the mm_malloc.h include
guard as a hack for avoiding its inclusion (mm_malloc.h requires a hosted
environment since it expects stdlib.h to be available - which is not the case
in these internal clang codegen tests).
This patch removes this hack and instead passes -ffreestanding to clang cc1.

Differential Revision: https://reviews.llvm.org/D24825

llvm-svn: 282581
2016-09-28 11:59:09 +00:00
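With that change, the RUN lines in these tests take roughly this shape (illustrative, not copied from a particular test):

  // RUN: %clang_cc1 -ffreestanding %s -triple=x86_64-apple-darwin -target-feature +sse -emit-llvm -o - | FileCheck %s

  #include <xmmintrin.h>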
Eric Christopher abb2b54ad3 After PR28761 use -Wall with -Werror in builtins tests to identify
possible problems in headers.

llvm-svn: 277696
2016-08-04 06:02:50 +00:00
Simon Pilgrim e3b9ee0645 [X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR
D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead.

It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems: INF/NaN/out-of-range values are guaranteed to result in a 0x80000000 value, which plays havoc with constant folding (it converts them to either zero or UNDEF). This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match).

This patch changes both scalar and packed versions back to using x86-specific builtins.

It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding.

Differential Revision: https://reviews.llvm.org/D22105

llvm-svn: 276102
2016-07-20 10:18:01 +00:00
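The behavioural difference is easy to see: for inputs the truncation cannot represent, CVTTPS2DQ produces the "integer indefinite" value 0x80000000, whereas a plain fptosi in IR is undefined for out-of-range values. A small illustration, assuming an SSE2 target:

  #include <emmintrin.h>
  #include <stdio.h>

  int main(void) {
    __m128 v = _mm_set_ps(1e30f, -5.7f, __builtin_nanf(""), 2.5f);
    __m128i t = _mm_cvttps_epi32(v);     /* truncating conversion */
    int r[4];
    _mm_storeu_si128((__m128i *)r, t);
    /* out-of-range and NaN lanes come back as INT_MIN (0x80000000) */
    printf("%d %d %d %d\n", r[0], r[1], r[2], r[3]);
    return 0;
  }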
Sanjay Patel 280cfd1a69 [x86] translate SSE packed FP comparison builtins to IR
As noted in the code comment, a potential follow-on would be to remove
the builtins themselves. Other than ord/unord, this already works as 
expected. Eg:

  typedef float v4sf __attribute__((__vector_size__(16)));
  v4sf fcmpgt(v4sf a, v4sf b) { return a > b; }

Differential Revision: http://reviews.llvm.org/D21268

llvm-svn: 272840
2016-06-15 21:20:04 +00:00
Simon Pilgrim 0e90936fea [X86] Ensure the load/store tests' unaligned pointers really are align 1
llvm-svn: 271227
2016-05-30 19:20:55 +00:00
Simon Pilgrim 43439bd33d [X86][SSE] Added missing tests (merge failure)
Differential Revision: http://reviews.llvm.org/D20617

llvm-svn: 271219
2016-05-30 17:58:38 +00:00
Craig Topper 09175dab31 [X86] Replace unaligned store builtins in SSE/AVX intrinsic files with code that will compile to a native unaligned store. Remove the builtins since they are no longer used.
Intrinsics will be removed from llvm in a future commit.

llvm-svn: 271214
2016-05-30 17:10:30 +00:00
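Roughly the pattern the headers switched to, sketched here with local names: assigning through a packed, may_alias struct gives the member alignment 1, so the store comes out as an align-1 vector store with no builtin involved.

  typedef float my_m128 __attribute__((__vector_size__(16)));

  static inline void my_storeu_ps(float *p, my_m128 a) {
    struct storer {
      my_m128 v;
    } __attribute__((__packed__, __may_alias__));
    /* the packed member has alignment 1, so this is an unaligned store */
    ((struct storer *)p)->v = a;
  }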
Simon Pilgrim 7b365bce6f [X86][SSE] Updated _mm_store_ps1 test to match _mm_store1_ps
llvm-svn: 270679
2016-05-25 09:20:08 +00:00
Craig Topper f70a61ff3f [X86] Update test cases to make sure storeu builtins use the storeu intrinsics. We were previously matching other stores in the IR because this is an -O0 test.
We should probably look into making the storeu builtins just emit a normal store with an alignment of 1.

llvm-svn: 270664
2016-05-25 05:26:23 +00:00
Simon Pilgrim 9b3729b043 [X86][SSE] Sync with llvm/test/CodeGen/X86/sse-intrinsics-fast-isel.ll
sse-builtins.c now just covers SSE1 intrinsics

llvm-svn: 270083
2016-05-19 17:11:31 +00:00
Simon Pilgrim 2d1decf7cb [X86][SSE] Tidied up MMX/SSE/SSE2 builtin tests to the correct test file
llvm-svn: 269852
2016-05-17 22:03:31 +00:00
Craig Topper 8ca5373c72 [X86] Fix a few intrinsic tests to use the return type that matches the intrinsic they're testing.
llvm-svn: 269735
2016-05-17 03:42:37 +00:00
Craig Topper fb79b5f273 [X86] Add 'pause' builtin that's already in llvm and use it instead of inline assembly to implement _mm_pause.
llvm-svn: 252712
2015-11-11 08:13:33 +00:00
Craig Topper a5455524c2 [X86] Use __builtin_ia32_paddq and __builtin_ia32_psubq to implement a couple of intrinsics that were supposed to operate on MMX registers. Otherwise we end up operating on GPRs. Throw in a test for _mm_mul_su32 while I was there.
llvm-svn: 252711
2015-11-11 08:00:41 +00:00
Simon Pilgrim 0d9d748bf1 [X86][SSSE3] Added SSSE3 IR + assembly codegen builtin tests
Transferred SSSE3 instructions from sse-builtins.c

llvm-svn: 246948
2015-09-06 17:06:22 +00:00
Simon Pilgrim ff88a0da31 [X86][SSE41] Added SSE41 IR + assembly codegen builtin tests
Transferred SSE41 instructions from sse-builtins.c

llvm-svn: 246947
2015-09-06 16:38:17 +00:00
Simon Pilgrim 5aba9925c0 [X86][SSE] Add _mm_undefined_* intrinsics
Added missing SSE/AVX 'undefined' intrinsics (PR24040):

_mm_undefined_pd, _mm_undefined_ps + _mm_undefined_si128
_mm256_undefined_pd, _mm256_undefined_ps + _mm256_undefined_si256
_mm512_undefined, _mm512_undefined_ps, _mm512_undefined_pd + _mm512_undefined_epi32

Added builtin intrinsics:

__builtin_ia32_undef128, __builtin_ia32_undef256 + __builtin_ia32_undef512

Differential Revision: http://reviews.llvm.org/D12052

llvm-svn: 246083
2015-08-26 21:17:12 +00:00
Simon Pilgrim 503976ad9a Added missing tests for SSE41 pmovsx/pmovzx extension intrinsics
llvm-svn: 245815
2015-08-23 16:19:38 +00:00
David Blaikie a953f2825b Update Clang tests to handle explicitly typed load changes in LLVM.
llvm-svn: 230795
2015-02-27 21:19:58 +00:00
Manuel Klimek fa27b8861b Make tests independent of llvm variable naming.
llvm-svn: 229484
2015-02-17 09:49:31 +00:00
Craig Topper 96f9a573b5 [X86] Convert palignr builtin handling to use shuffle form of right shift instead of intrinsics. This should allow the intrinsics to be removed from the backend.
llvm-svn: 229474
2015-02-17 07:18:01 +00:00
Craig Topper 480e2b6e43 [X86] Merge the 2 separate builtin handlers for PALIGNR into a single one that handles both.
llvm-svn: 229469
2015-02-17 06:37:58 +00:00
Craig Topper 4fb4581716 [X86] Fix test cases that I foolishly copied and modified from another file that had optimizations on. This caused the check patterns to not quite match.
llvm-svn: 229073
2015-02-13 06:27:39 +00:00
Craig Topper a462482d98 [X86] Add _mm_bslli_si128 and _mm_bsrli_si128 as aliases of _mm_slli_si128 and _mm_srli_si128. This matches Intel documentation and gcc.
llvm-svn: 229066
2015-02-13 06:04:45 +00:00
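Usage sketch (assuming an SSE2 target); the new names take the same byte-count immediate as the old ones:

  #include <emmintrin.h>

  __m128i shift_bytes(__m128i v) {
    __m128i left  = _mm_bslli_si128(v, 4);   /* same as _mm_slli_si128(v, 4) */
    __m128i right = _mm_bsrli_si128(v, 4);   /* same as _mm_srli_si128(v, 4) */
    return _mm_or_si128(left, right);
  }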
Craig Topper 2094d8fe88 [x86] Add the (v)cmpps/pd/ss/sd builtins to match gcc. Use them in the sse intrinsic files.
This still lowers to the same intrinsics as before.

This is preparation for bounds checking the immediate on the AVX version of the builtin so we don't pass illegal immediates into the backend. Since SSE uses a smaller immediate, it's not possible to bounds check when using a shared builtin. Rather than creating a clang-specific builtin for the different immediate, I decided (after consulting with Chandler) that it was better to match gcc.

llvm-svn: 224879
2014-12-27 06:59:57 +00:00
Evgeniy Stepanov 2be29929be Fix line numbers for code inlined from __nodebug__ functions.
Instructions from __nodebug__ functions don't have file:line
information even when inlined into non-__nodebug__ functions. As a result,
intrinsics (SSE and other) from <*intrin.h> clang headers _never_
have file:line information.

With this change, an instruction without !dbg metadata gets one from
the call instruction when inlined.

Fixes PR19001.

llvm-svn: 210459
2014-06-09 09:09:19 +00:00
Filipe Cabecinhas 5d289b48b1 Patched clang to emit x86 blends as shufflevectors.
Summary:
Most of the clang header patch by Simon Pilgrim @ SCEE.
Also fixed (or added) clang tests for these intrinsics.

LLVM tests to make sure we get the blend instruction out of these
shufflevectors are at http://reviews.llvm.org/D3600

Reviewers: eli.friedman, craig.topper, rafael

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D3601

llvm-svn: 208664
2014-05-13 02:37:02 +00:00
Manman Ren c94122e05b Intrinsics: fix extract & insert when the index is out of bounds.
Now all extract & insert intrinsics should have the correct AND operation
to ignore the higher bits.

rdar://15250497

llvm-svn: 193267
2013-10-23 20:33:14 +00:00
Manman Ren be38b9e15f _mm_extract_epi16: use "& 7" when the index is out of bounds.
This is in line with the implementation of _mm_extract_pi16.
rdar://15250497

llvm-svn: 193187
2013-10-22 19:24:42 +00:00
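In other words, an out-of-range selector now behaves as if reduced modulo the number of 16-bit lanes instead of producing bad IR; for example, a selector of 9 reads the same lane as this in-range call (illustrative, SSE2 target assumed):

  #include <emmintrin.h>

  int pick_lane1(__m128i v) {
    /* 9 & 7 == 1, so an out-of-range selector of 9 would read this lane */
    return _mm_extract_epi16(v, 1);
  }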
Eli Friedman f9d8c6cebb Add _mm_stream_si64 intrinsic.
While I'm here, also fix the alignment computation for the whole family of
intrinsics.

PR17298.

llvm-svn: 191243
2013-09-23 23:38:39 +00:00
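For reference, the new intrinsic is a non-temporal 64-bit integer store and is only available when targeting x86-64; a minimal usage sketch:

  #include <emmintrin.h>

  #ifdef __x86_64__
  void write_nt(long long *dst, long long value) {
    /* non-temporal store (movnti), bypassing the cache hierarchy */
    _mm_stream_si64(dst, value);
  }
  #endif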
Stephen Lin 4362261b00 CHECK-LABEL-ify some code gen tests to improve diagnostic experience when tests fail.
llvm-svn: 188447
2013-08-15 06:47:53 +00:00
Manman Ren 5750c1c07e X86 SSE Intrinsics: update header for sqrt_ss, rsqrt_ss and rcp_ss.
These intrinsics pass through the upper FP values from the input.
rdar://12558838

llvm-svn: 166743
2012-10-26 00:25:10 +00:00
Chad Rosier 87622b8b84 Get rid of storelv4si builtin as it can be expressed directly. This is general
goodness because it provides opportunities to clean things up. For example,

uint64_t t1(__m128i vA)
{
  uint64_t Alo;
  _mm_storel_epi64((__m128i*)&Alo, vA);
  return Alo;
}

was generating 

	movq	%xmm0, -8(%rbp)
	movq	-8(%rbp), %rax

and now generates

	movd	%xmm0, %rax

rdar://11282581

llvm-svn: 155924
2012-05-01 18:11:51 +00:00
Craig Topper 74c17c65e4 Correctly check argument types for some vector macros in smmintrin.h. Put parentheses around uses of vector macro arguments.
llvm-svn: 153732
2012-03-30 07:01:17 +00:00
Craig Topper 97f042f2d6 Add _mm_minpos_epu16 to smmintrin.h. Fixes PR12399.
llvm-svn: 153726
2012-03-30 05:41:28 +00:00
NAKAMURA Takumi 3f62af76b9 test/CodeGen/sse-builtins.c: Make this host-independent to unbreak posix-unlike hosts.
Without -ffreestanding, clang tries to seek /usr/include/stdlib.h in host filesystem, even on Windows hosts.

llvm-svn: 139899
2011-09-16 03:55:36 +00:00
Eli Friedman 9bb51adcce Tweak *mmintrin.h so that they don't make any bad assumptions about alignment (which probably has little effect in practice, but better to get it right). Make the load in _mm_loadh_pi and _mm_loadl_pi a single LLVM IR instruction to make optimizing easier for CodeGen.
rdar://10054986

llvm-svn: 139874
2011-09-15 23:15:27 +00:00