3654 Commits

Author SHA1 Message Date
Michael Zuckerman 22c47e606a Adding missing _mm512_castsi512_si256 intrinsic.
llvm-svn: 270851
2016-05-26 14:32:11 +00:00
Simon Pilgrim 1fdfbf6941 [X86][F16C] Improved f16c intrinsics checks
Added checks for upper elements being zeroed in scalar conversions

llvm-svn: 270836
2016-05-26 10:20:25 +00:00
Simon Pilgrim 57446efaa9 [X86][AVX2] Improved checks for float/double mask generation for non-masked gathers
llvm-svn: 270833
2016-05-26 09:56:50 +00:00
Michael Zuckerman eb5f178c4b Fix intrinsics names:
_mm128_cmp_ps_mask-->_mm_cmp_ps_mask
_mm128_mask_cmp_ps_mask-->_mm_mask_cmp_ps_mask
_mm128_cmp_pd_mask-->_mm_cmp_pd_mask
_mm128_mask_cmp_pd_mask-->_mm_mask_cmp_pd_mask

llvm-svn: 270830
2016-05-26 08:10:12 +00:00
Michael Zuckerman 6f08cebf36 [Clang][AVX512][BUILTIN] Adding intrinsics for set1
Differential Revision: http://reviews.llvm.org/D20562

llvm-svn: 270825
2016-05-26 06:54:52 +00:00
Simon Pilgrim f1ad90d509 [X86][AVX2] Full set of AVX2 intrinsics tests
llvm/test/CodeGen/X86/avx2-intrinsics-fast-isel.ll will be synced to this

llvm-svn: 270708
2016-05-25 15:10:49 +00:00
Benjamin Kramer 1f4381f810 [AVX512] Don't rely on value names. They're different in release builds.
llvm-svn: 270704
2016-05-25 14:30:01 +00:00
Michael Zuckerman d5cc6cd262 [Clang][AVX512][BUILTIN] Add missing intrinsics for cast
Differential Revision: http://reviews.llvm.org/D20523

llvm-svn: 270699
2016-05-25 14:04:21 +00:00
Denis Zobnin eebc4af0ed [ms][dll] #26935 Defining a dllimport function should cause it to be exported
If a function is declared with the dllimport attribute and is later defined in
the same module without that attribute, the dllexport attribute should be added
to the definition.
The same applies to variables.

Example:
struct __declspec(dllimport) C3 {
  ~C3();
};
C3::~C3() {;} // we should export this definition.

Patch by Andrew V. Tischenko

Differential revision: http://reviews.llvm.org/D18953

llvm-svn: 270686
2016-05-25 11:32:42 +00:00
Simon Pilgrim 7b365bce6f [X86][SSE] Updated _mm_store_ps1 test to match _mm_store1_ps
llvm-svn: 270679
2016-05-25 09:20:08 +00:00
Craig Topper f70a61ff3f [X86] Update test cases to make sure storeu builtins use the storeu intrinsics. We were previously matching on other stores in the IR because this is an -O0 test.
We should probably look into making the storeu builtins just emit a normal store with an alignment of 1.

llvm-svn: 270664
2016-05-25 05:26:23 +00:00
Hans Wennborg 9464491aa7 Rename test/CodeGen/inline-optim.cc to .c and provide a triple
llvm-svn: 270633
2016-05-24 23:37:56 +00:00
Hans Wennborg 7a00888a08 [Driver] Add support for -finline-functions and /Ob2 flags
-finline-functions and /Ob2 are currently ignored by Clang. The only way to
enable inlining is to use the global O flags, which also enable other options,
or to emit LLVM bitcode using Clang and then run opt by hand with the inline
pass.

This patch makes it possible to use the -finline-functions flag (same as GCC)
or /Ob2 in clang-cl mode to enable inlining without other optimizations.

This is the first patch of a series to improve support for the /Ob flags.

Patch by Rudy Pons <rudy.pons@ilod.org>!

Differential Revision: http://reviews.llvm.org/D20576

llvm-svn: 270609
2016-05-24 20:40:51 +00:00
David Majnemer a38c9f1fa5 [MS Volatile] Don't make volatile loads/stores to underaligned objects atomic
Underaligned atomic LValues require libcalls which MSVC doesn't have.
MSVC doesn't seem to consider such operations as requiring a barrier
anyway.

This fixes PR27843.

llvm-svn: 270576
2016-05-24 16:09:25 +00:00
Jacob Baungard Hansen 13a4937404 [Sparc] Add software float option -msoft-float
Summary:
Following patch D19265, which enables software floating point support in the Sparc backend, this patch lets the front-end turn that support on with the -msoft-float option.

The user should ensure a library (such as the builtins from Compiler-RT) that includes the software floating point routines is provided.

Reviewers: jyknight, lero_chris

Subscribers: jyknight, cfe-commits

Differential Revision: http://reviews.llvm.org/D20419

llvm-svn: 270538
2016-05-24 08:30:08 +00:00
Simon Pilgrim 90770c7c76 [X86][SSE] Replace lossless i32/f32 to f64 conversion intrinsics with generic IR
Both the (V)CVTDQ2PD(Y) (i32 to f64) and (V)CVTPS2PD(Y) (f32 to f64) conversion instructions are lossless and can be safely represented as generic __builtin_convertvector calls instead of x86 intrinsics without affecting final codegen.

This patch removes the clang builtins and their use in the sse2/avx headers - a future patch will deal with removing the llvm intrinsics, but that will require a bit more work.

Differential Revision: http://reviews.llvm.org/D20528

llvm-svn: 270499
2016-05-23 22:13:02 +00:00
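Note on r270499: the following is a minimal sketch of the idea, not the code from Clang's sse2/avx headers. It assumes a compiler that provides the GCC/Clang vector extensions and `__builtin_convertvector`; the typedef and function names are hypothetical. Every i32 and every f32 value is exactly representable as an f64, which is why the generic conversion loses nothing.

```c
/* Illustrative vector types; these names are made up for this sketch and are
   not the typedefs used in Clang's intrinsic headers. */
typedef int    v2si __attribute__((vector_size(8)));   /* 2 x i32 */
typedef float  v2sf __attribute__((vector_size(8)));   /* 2 x f32 */
typedef double v2df __attribute__((vector_size(16)));  /* 2 x f64 */

/* i32 -> f64: exact, because a double has a 53-bit significand. */
static inline v2df i32x2_to_f64x2(v2si v) {
    return __builtin_convertvector(v, v2df);
}

/* f32 -> f64: exact, because every float value is also a double value. */
static inline v2df f32x2_to_f64x2(v2sf v) {
    return __builtin_convertvector(v, v2df);
}
```

Because the conversion is expressed as generic vector code rather than a target-specific builtin, the optimizer is free to treat it like any other integer-to-float or float-extend operation.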
Michael Zuckerman f86eb71616 [clang][AVX512][Builtin] adding missing intrinsics for vpmultishiftqb{128|256|512} instruction set.
Differential Revision: http://reviews.llvm.org/D20521

llvm-svn: 270441
2016-05-23 15:04:39 +00:00
Michael Zuckerman e6542002fc [Clang][AVX512][BUILTIN] adding missing intrinsics for movdqa instruction set
Differential Revision: http://reviews.llvm.org/D20514

llvm-svn: 270401
2016-05-23 08:01:48 +00:00
Simon Pilgrim 28666ce778 [X86][AVX] Ensure zero-extension of _mm256_extract_epi8 and _mm256_extract_epi16
Ensure _mm256_extract_epi8 and _mm256_extract_epi16 zero extend their i8/i16 result to i32. This matches _mm_extract_epi8 and _mm_extract_epi16.

Fix for PR27594

Differential Revision: http://reviews.llvm.org/D20468

llvm-svn: 270330
2016-05-21 21:14:35 +00:00
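Note on r270330: the difference between zero extension and sign extension of an extracted lane can be sketched in plain C, without the intrinsics themselves. The helper names below are hypothetical and only model the widening step; they do not reproduce the header implementation.

```c
#include <stdint.h>

/* Zero extension: the i8 lane is reinterpreted as unsigned before widening,
   so the upper 24 bits of the i32 result are always 0. */
static int widen_i8_zext(int8_t lane) {
    return (int)(uint8_t)lane;
}

/* Sign extension, shown for contrast: the sign bit is replicated. */
static int widen_i8_sext(int8_t lane) {
    return (int)lane;
}

/* Zero extension for an i16 lane. */
static int widen_i16_zext(int16_t lane) {
    return (int)(uint16_t)lane;
}
```

With zero extension, a lane holding the bit pattern 0xFF widens to 255 rather than -1, which is the behaviour the commit message says the 256-bit extract intrinsics now share with `_mm_extract_epi8` and `_mm_extract_epi16`.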
Simon Pilgrim 8a8c4e1404 [X86][AVX] Added _mm256_testc_si256/_mm256_testnzc_si256/_mm256_testz_si256 tests
llvm-svn: 270227
2016-05-20 15:49:17 +00:00
Benjamin Kramer f4c520d5d2 Add all the avx512 flavors to __builtin_cpu_supports's list.
This matches what trunk GCC accepts. Also adds a missing ssse3
case. PR27779. The amount of duplication here is annoying; maybe it
should be factored into a separate .def file?

llvm-svn: 270224
2016-05-20 15:21:08 +00:00
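Note on r270224: a short sketch of how `__builtin_cpu_supports` is typically queried. The wrapper names are hypothetical, and the fallback of returning 0 on non-x86 targets is an assumption made so the sketch compiles everywhere; "ssse3" and "avx512f" are feature strings accepted by the builtin.

```c
/* Returns 1 if the running CPU reports SSSE3, 0 otherwise.
   The builtin requires a string literal argument. */
static int cpu_has_ssse3(void) {
#if defined(__x86_64__) || defined(__i386__)
    return __builtin_cpu_supports("ssse3") != 0;
#else
    return 0; /* assumption: treat non-x86 targets as "not supported" */
#endif
}

/* Same query for one of the AVX-512 flavors. */
static int cpu_has_avx512f(void) {
#if defined(__x86_64__) || defined(__i386__)
    return __builtin_cpu_supports("avx512f") != 0;
#else
    return 0; /* assumption: treat non-x86 targets as "not supported" */
#endif
}
```

A caller would usually branch on these results once at startup and select a code path accordingly.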
Krzysztof Parzyszek 89fb44147b [Hexagon] Recognize "s" constraint in inline-asm
llvm-svn: 270216
2016-05-20 13:50:32 +00:00
Simon Pilgrim 4fa8250ad0 [X86][AVX] Added _mm256_extract_epi64 test
llvm-svn: 270212
2016-05-20 12:57:21 +00:00
Simon Pilgrim 94b17773e5 [X86][AVX] Full set of AVX intrinsics tests
llvm/test/CodeGen/X86/avx-intrinsics-fast-isel.ll will be synced to this

llvm-svn: 270210
2016-05-20 12:41:02 +00:00
Justin Lebar 2e4ecfdebe [CUDA] Implement __ldg using intrinsics.
Summary:
Previously it was implemented as inline asm in the CUDA headers.

This change allows us to use the [addr+imm] addressing mode when
executing ld.global.nc instructions.  This translates into a 1.3x
speedup on some benchmarks that call this instruction from within an
unrolled loop.

Reviewers: tra, rsmith

Subscribers: jhen, cfe-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D19990

llvm-svn: 270150
2016-05-19 22:49:13 +00:00
Benjamin Kramer 504c01cc67 Don't rely on value numbers in test, those are fragile and change in Release (no asserts) builds.
llvm-svn: 270085
2016-05-19 17:57:35 +00:00
Artem Belevich ffa5fc51b8 [CUDA] Allow sm_50,52,53 GPUs
LLVM accepts them since r233575.

Differential Revision: http://reviews.llvm.org/D20405

llvm-svn: 270084
2016-05-19 17:47:47 +00:00
Simon Pilgrim 9b3729b043 [X86][SSE] Sync with llvm/test/CodeGen/X86/sse-intrinsics-fast-isel.ll
sse-builtins.c now just covers SSE1 intrinsics

llvm-svn: 270083
2016-05-19 17:11:31 +00:00
Simon Pilgrim bcf8846be5 [X86][SSE2] Fixed shuffle of results in _mm_cmpnge_sd/_mm_cmpngt_sd tests
llvm-svn: 270079
2016-05-19 16:48:59 +00:00
Ranjeet Singh b631aafee3 [ARM] Fix cdp intrinsic
- Fixed the cdp intrinsic to accept only compile-time
  constant values. Previously you could pass a
  variable to the builtin, which would result in
  illegal LLVM assembly output.

Differential Revision: http://reviews.llvm.org/D20394

llvm-svn: 270058
2016-05-19 13:04:34 +00:00
Michael Zuckerman 178113e8cc [Clang][AVX512][intrinsics] continue completing missing set intrinsics
Differential Revision: http://reviews.llvm.org/D20160

llvm-svn: 270047
2016-05-19 12:07:49 +00:00
Simon Pilgrim 97728dfb39 [X86][SSE2] Added _mm_move_* tests
llvm-svn: 270043
2016-05-19 11:18:49 +00:00
Simon Pilgrim cddcd2bd45 [X86][SSE2] Added _mm_cast* and _mm_set* tests
llvm-svn: 270042
2016-05-19 11:03:48 +00:00
Simon Pilgrim 3f64bb9618 [X86][SSE2] Sync with llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll
llvm-svn: 270034
2016-05-19 09:52:59 +00:00
Simon Pilgrim 063c57c1f9 Revert r269967 (SSE2 builtin checks) due to failed buildbots
llvm-svn: 269970
2016-05-18 18:22:20 +00:00
Simon Pilgrim 8beed747ce [X86][SSE2] Sync with llvm/test/CodeGen/X86/sse2-intrinsics-fast-isel.ll
llvm-svn: 269967
2016-05-18 18:12:34 +00:00
Michael Zuckerman 2cacc35343 [Clang][AVX512] completing missing intrinsics [pandnd].
Differential Revision: http://reviews.llvm.org/D20101

llvm-svn: 269939
2016-05-18 15:25:53 +00:00
Krzysztof Parzyszek e0026e4e21 [Hexagon] Recognize "q" and "v" in inline-asm as register constraints
Clang follow-up to r269933.

llvm-svn: 269934
2016-05-18 14:56:14 +00:00
Simon Pilgrim a090864762 Removed duplicate SSE42 builtin tests from avx-builtins.c
llvm-svn: 269932
2016-05-18 14:32:16 +00:00
Simon Pilgrim 519c78f3ae [X86][SSE42] Sync with llvm/test/CodeGen/X86/sse42-intrinsics-fast-isel.ll
llvm-svn: 269931
2016-05-18 14:29:55 +00:00
Simon Pilgrim 7a4d7d47c9 [X86][SSE41] Sync with llvm/test/CodeGen/X86/sse41-intrinsics-fast-isel.ll
llvm-svn: 269926
2016-05-18 13:47:16 +00:00
Simon Pilgrim 7e148a94a4 [X86][SSE3] Sync with llvm/test/CodeGen/X86/sse3-intrinsics-fast-isel.ll
llvm-svn: 269921
2016-05-18 13:17:39 +00:00
Ashutosh Nema 51c9dd0081 Add new intrinsic support for MONITORX and MWAITX instructions
Summary:
MONITORX/MWAITX instructions provide similar capability to the MONITOR/MWAIT
pair while adding a timer function, such that another termination of the MWAITX
instruction occurs when the timer expires. The presence of the MONITORX and 
MWAITX instructions is indicated by CPUID 8000_0001, ECX, bit 29.

The MONITORX and MWAITX instructions are intercepted by the same bits that
intercept MONITOR and MWAIT. MONITORX instruction establishes a range to be
monitored. MWAITX instruction causes the processor to stop instruction
execution and enter an implementation-dependent optimized state until
occurrence of a class of events.

The opcode of the MONITORX instruction is "0F 01 FA". The opcode of the MWAITX
instruction is "0F 01 FB". This opcode information is used in adding tests for
the disassembler.

These instructions are enabled for AMD's bdver4 architecture.

Patch by Ganesh Gopalasubramanian!

Reviewers: echristo, craig.topper

Subscribers: RKSimon, joker.eph, llvm-commits, cfe-commits

Differential Revision: http://reviews.llvm.org/D19796

llvm-svn: 269907
2016-05-18 11:56:23 +00:00
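Note on r269907: the feature bit mentioned in the message (CPUID 8000_0001, ECX, bit 29) can be tested as sketched below. The function and macro names are hypothetical, and the sketch assumes the caller has already obtained the ECX value from CPUID leaf 0x80000001 by whatever means the platform offers; it only decodes the bit.

```c
#include <stdint.h>

/* Bit position of the MONITORX/MWAITX flag in ECX of CPUID leaf 0x80000001,
   as stated in the commit message. */
#define MONITORX_ECX_BIT 29u

/* Returns 1 if the given ECX value (from CPUID leaf 0x80000001) reports
   MONITORX/MWAITX support, 0 otherwise. */
static int ecx_reports_monitorx(uint32_t ecx) {
    return (int)((ecx >> MONITORX_ECX_BIT) & 1u);
}
```

Keeping the decoding separate from the CPUID query makes the check easy to test with fixed register values.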
Craig Topper 39c871038a [X86] Add immediate range checks for many of the builtins.
This time allow -128 to 255 for builtins that use a char type immediate.

llvm-svn: 269878
2016-05-18 03:18:12 +00:00
Simon Pilgrim 2d1decf7cb [X86][SSE] Tidied up MMX/SSE/SSE2 builtin tests to the correct test file
llvm-svn: 269852
2016-05-17 22:03:31 +00:00
Filipe Cabecinhas 09fbfcafc3 Revert "[X86] Add immediate range checks for many of the builtins."
This reverts commit r269619.

llvm-svn: 269765
2016-05-17 14:07:43 +00:00
Craig Topper dbbe4a5542 [AVX512] Fix return types in several test cases to match the intrinsic they're testing.
llvm-svn: 269738
2016-05-17 04:41:32 +00:00
Craig Topper 8ca5373c72 [X86] Fix a few intrinsic tests to use the return type that matches the intrinsic they're testing.
llvm-svn: 269735
2016-05-17 03:42:37 +00:00
Michael Zuckerman bf05a4589e [Clang][AVX512] completing missing intrinsics for [vpabs] instruction set
Differential Revision: http://reviews.llvm.org/D20069

llvm-svn: 269680
2016-05-16 18:57:24 +00:00
Nico Weber 379a1952b3 [ms] Reintroduce feature guards in intrinsic headers in Microsoft mode
Visual Studio's C++ standard library headers include intrin.h, so the intrinsic
headers get included a lot more often in Microsoft mode than elsewhere. The
AVX512 intrinsics are a lot of code (0.7 MB, causing 30% compile time overhead
for small programs including e.g. <string> and 6% compile time overhead for
larger projects such as v8). Since multiversioning can't be relied on in
Microsoft mode (cl.exe doesn't support it), having faster compiles seems like
the much better tradeoff until we have a better intrinsic story going forward
(which we'll need for e.g. PR19898).

Actually using intrinsics on Windows already requires the right /arch:
settings, so this patch should have no big behavior change.

See also thread "The intrinsics headers (especially avx512) are too big. What
to do about it?" on cfe-dev.

http://reviews.llvm.org/D20291

llvm-svn: 269675
2016-05-16 18:14:07 +00:00