Commit Graph

39 Commits

Author SHA1 Message Date
Craig Topper d619eaaae4 [X86] Add missing typecasts in intrinsic macros. This should make them more robust against inputs that aren't already the right type.
llvm-svn: 252700
2015-11-11 03:47:10 +00:00
Craig Topper 7148166785 [X86] Remove temporary variables from macros in x86 intrinsic headers. Prevents duplicate names appearing from multiple macro expansions. NFC
llvm-svn: 252586
2015-11-10 05:08:05 +00:00
Ahmed Bougacha 7dfaaf3891 [Headers][X86] Fix stream_load (movntdqa) to accept const*.
Per Intel intrinsics guide:
- _mm256_stream_load_si256 takes `__m256i const *'
- _mm_stream_load_si128 takes `__m128i *', for no good reason.

Let's accept const* for both.

llvm-svn: 249213
2015-10-02 23:29:26 +00:00
Chandler Carruth cbe6411401 Fix the SSE4 byte sign extension in a cleaner way, and more thoroughly
test that our intrinsics behave the same under -fsigned-char and
-funsigned-char.

This further testing uncovered that AVX-2 has a broken cmpgt for 8-bit
elements, and has for a long time. This is fixed in the same way as
SSE4 handles the case.

The other ISA extensions currently work correctly because they use
specific instruction intrinsics. As soon as they are rewritten in terms
of generic IR, they will need to add these special casts. I've added the
necessary testing to catch this however, so we shouldn't have to chase
it down again.

I considered changing the core typedef to be signed, but that seems like
a bad idea. Notably, it would be an ABI break if anyone is reaching into
the innards of the intrinsic headers and passing __v16qi on an API
boundary. I can't be completely confident that this wouldn't happen due
to a macro expanding in a lambda, etc., so it seems much better to leave
it alone. It also matches GCC's behavior exactly.

A fun side note is that for both GCC and Clang, -funsigned-char really
does change the semantics of __v16qi. To observe this, consider:

  % cat x.cc
  #include <smmintrin.h>
  #include <iostream>

  int main() {
    __v16qi a = { 1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
    __v16qi b = _mm_set1_epi8(-1);
    std::cout << (int)(a / b)[0] << ", " << (int)(a / b)[1] << '\n';
  }
  % clang++ -o x x.cc && ./x
  -1, 1
  % clang++ -funsigned-char -o x x.cc && ./x
  0, 1

However, while this may be surprising, both Clang and GCC agree.

Differential Revision: http://reviews.llvm.org/D13324

llvm-svn: 249097
2015-10-01 23:40:12 +00:00
Chandler Carruth 9143378db0 Patch over a really horrible bug in our vector builtins that showed up
recently when we started using direct conversion to model sign
extension. The __v16qi type we use for SSE v16i8 vectors is defined in
terms of 'char' which may or may not be signed! This causes us to
generate pmovsx and pmovzx depending on the setting of -funsigned-char.

This patch just forms an explicitly signed type and uses that to
formulate the sign extension. While this gets the correct behavior
(which we now verify with the enhanced test) this is just the tip of the
ice berg. Now that I know what to look for, I have found errors of this
sort *throughout* our vector code. Fortunately, this is the only
specific place where I know of users actively having their code
miscompiled by Clang due to this, so I'm keeping the fix for those users
minimal and targeted.

I'll be sending a proper email for discussion of how to fix these
systematically, what the implications are, and just how widely broken
this is... From what I can tell, we have never shipped a correct set of
builtin headers for x86 when users rely on -funsigned-char. Oops.

llvm-svn: 248980
2015-10-01 02:21:34 +00:00
Simon Pilgrim 12919f7e49 [X86][SSE] Replace 128-bit SSE41 PMOVSX intrinsics with native IR
128-bit vector integer sign extensions correctly lower to the pmovsx instructions even for debug builds.

This patch removes the builtins and reimplements the _mm_cvtepi*_epi* intrinsics __using builtin_shufflevector (to extract the bottom most subvector) and __builtin_convertvector (to actually perform the sign extension).

Differential Revision: http://reviews.llvm.org/D12835

llvm-svn: 248092
2015-09-19 15:12:38 +00:00
Sean Silva e4c3760a9f Clean up trailing whitespace in the builtin headers
llvm-svn: 247498
2015-09-12 02:55:19 +00:00
Michael Kuperstein e45af54cdb [X86] Rename DEFAULT_FN_ATTR macro to __DEFAULT_FN_ATTR
llvm-svn: 241065
2015-06-30 13:36:19 +00:00
Eric Christopher 9fc7fb274e Update the intel intrinsic headers to use the target attribute support.
This involved removing the conditional inclusion and replacing them
with target attributes matching the original conditional inclusion
and checks. The testcase update removes the macro checks for each
file and replaces them with usage of the __target__ attribute, e.g.:

int __attribute__((__target__(("sse3")))) foo(int a) {
  _mm_mwait(0, 0);
  return 4;
}

This usage does require the enclosing function have the requisite
__target__ attribute for inlining and code generation - also for
any macro intrinsic uses in the enclosing function. There's no change
for existing uses of the intrinsic headers.

llvm-svn: 239883
2015-06-17 07:09:32 +00:00
Eric Christopher 4d185168e9 Use a define for per-file function attributes for the Intel intrinsic headers.
This is a precursor to changing them to use the new target attribute
code.

llvm-svn: 239882
2015-06-17 07:09:20 +00:00
Filipe Cabecinhas 5d289b48b1 Patched clang to emit x86 blends as shufflevectors.
Summary:
Most of the clang header patch by Simon Pilgrim @ SCEE.
Also fixed (or added) clang tests for these intrinsics.

LLVM tests to make sure we get the blend instruction out of these
shufflevectors are at http://reviews.llvm.org/D3600

Reviewers: eli.friedman, craig.topper, rafael

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D3601

llvm-svn: 208664
2014-05-13 02:37:02 +00:00
Manman Ren c94122e05b Intrinsics: fix extract & insert when index is out of bound.
Now, all extract & insert intrinsics should have the correct and operation
to ignore higher bits.

rdar://15250497

llvm-svn: 193267
2013-10-23 20:33:14 +00:00
Eli Friedman 9b04f41899 Fix return type of _mm_extract_epi8 etc.
PR17300.

llvm-svn: 191120
2013-09-21 00:05:25 +00:00
David Blaikie 3302f2bd46 PR14964: intrinsic headers using non-reserved identifiers
Several of the intrinsic headers were using plain non-reserved identifiers.
C++11 17.6.4.3.2 [global.names] p1 reservers names containing a double
begining with an underscore followed by an uppercase letter for any use.

I think I got them all, but open to being corrected. For the most part I
didn't bother updating function-like macro parameter names because I don't
believe they're subject to any such collission - though some function-like
macros already follow this convention (I didn't update them in part because
the churn was more significant as several function-like macros use the double
underscore prefixed version of the same name as a parameter in their
implementation)

llvm-svn: 172666
2013-01-16 23:08:36 +00:00
Craig Topper 74c17c65e4 Correctly check argument types for some vector macros in smmintrin.h. Put parentheses around uses of vector macro arguments.
llvm-svn: 153732
2012-03-30 07:01:17 +00:00
Craig Topper 97f042f2d6 Add _mm_minpos_epu16 to smmintrin.h. Fixes PR12399.
llvm-svn: 153726
2012-03-30 05:41:28 +00:00
Craig Topper 1de8348db7 Add popcnt feature flag to match gcc. This flag is implied when sse42 is enabled, but can be disabled separately. Move popcnt intrinsics to popcntintrin.h to match gcc.
llvm-svn: 147340
2011-12-29 16:10:46 +00:00
Craig Topper a89747dd1e Add AVX2 intrinsics for pavg, pblend, and pcmp instructions. Also remove unneeded builtins for SSE pcmp. Change SSE pcmpeqq and pcmpgtq to not use builtins and just use vector == and >.
llvm-svn: 146969
2011-12-20 09:55:26 +00:00
Bob Wilson 16c4195548 Fix obvious error in _mm_test_all_zeros. PR11565.
Patch by Mathias Gaunard!

llvm-svn: 146565
2011-12-14 17:17:16 +00:00
Chandler Carruth 222c66db38 Fix a blatant typo or cut/paste-o reported by users of this header.
llvm-svn: 146251
2011-12-09 09:23:55 +00:00
Eli Friedman f16beb3942 Fix some additional x86 intrinsics to use "I" (ICE) markings. Fix *mmintrin.h to take them into account.
<rdar://problem/10341145>

llvm-svn: 144246
2011-11-10 00:11:13 +00:00
Eli Friedman 9586cdb01e Misc fixes to pcmp*stri.
llvm-svn: 144073
2011-11-08 04:13:51 +00:00
Eric Christopher 2a9898f0a2 Move some type defines from smmintrin.h to emmintrin.h to match where
gcc defines them.

llvm-svn: 112146
2010-08-26 02:09:25 +00:00
Chris Lattner 9052c35479 fix some vector extractions to return properly zero extended values
(instead of sign extending) to match ICC.  GCC is changing this in 
a series of their own PRs (e.g. 41323).

llvm-svn: 111637
2010-08-20 16:08:33 +00:00
Daniel Dunbar f5e075d392 Headers: Fix quoting of macro arguments in a couple more places.
llvm-svn: 105331
2010-06-02 16:35:01 +00:00
Eric Christopher bd9a3aecd6 This is just a simple v4si * v4si, make it so.
llvm-svn: 99587
2010-03-26 00:51:28 +00:00
Anders Carlsson 91e18c93c4 Make the license header in smmintrin.h match the other SSE headers.
llvm-svn: 99384
2010-03-24 05:31:31 +00:00
Chris Lattner 7eac805bb0 fix PR6658: inline isn't a keyword in C89 mode, use __inline__ instead.
llvm-svn: 99190
2010-03-22 18:14:12 +00:00
Eric Christopher 08f135274d Add sse4.2 header and builtin support.
llvm-svn: 99051
2010-03-20 07:43:28 +00:00
Eric Christopher 8c6f61394f Add remaining sse4.1 intrinsics and builtins.
llvm-svn: 98587
2010-03-15 23:22:58 +00:00
Eric Christopher 6932b2e8b7 Add SSE4 packed integer comparisons and corresponding intrinsics.
llvm-svn: 98323
2010-03-12 01:22:33 +00:00
Eric Christopher e486f68b59 Integer array extraction for sse4.1.
llvm-svn: 98305
2010-03-11 23:50:18 +00:00
Eric Christopher e7594305bc Add packed integer array insertion.
llvm-svn: 98299
2010-03-11 23:36:29 +00:00
Eric Christopher 1dca62055a Add insert/extract_ps and related random macros.
llvm-svn: 98114
2010-03-10 00:50:58 +00:00
Eric Christopher 4c70358296 Add sse4.1 packed min and max intrinsics.
llvm-svn: 97907
2010-03-07 07:00:42 +00:00
Eric Christopher 7288890b51 Add load hint instruction intrinsic.
llvm-svn: 97904
2010-03-07 06:29:09 +00:00
Eric Christopher 87990fe5df Add in support for dword multiply and fp dot product intrinsics.
llvm-svn: 97902
2010-03-07 06:17:19 +00:00
Eric Christopher b0759be4d0 Fix _MM_FROUND_NEARBYINT and move rounding intrinsics to macros.
llvm-svn: 97874
2010-03-06 10:31:44 +00:00
Eric Christopher 94567c04bb First start on smmintrin.h, rounding and blending.
llvm-svn: 97717
2010-03-04 02:56:19 +00:00