7580 Commits

Author SHA1 Message Date
Igor Breger 61e628591f [AVX512] Fix load opcode for fast isel.
Differential Revision: http://reviews.llvm.org/D21067

llvm-svn: 272006
2016-06-07 13:08:45 +00:00
Simon Pilgrim ca1da1bf07 [X86][SSE] Improved blend+zero target shuffle combining to use combined shuffle mask directly
We currently only combine to blend+zero if the target value type has 8 elements or less, but this was missing a lot of cases where the combined mask had been widened.

This change makes it so we use the combined mask to determine the blend value type, allowing us to catch more widened cases.

llvm-svn: 272003
2016-06-07 12:20:14 +00:00
Igor Breger edafb0595e [KNL] Fix UMULO lowering.
Differential Revision: http://reviews.llvm.org/D21013

llvm-svn: 271891
2016-06-06 12:24:52 +00:00
Craig Topper 33350cc406 [AVX512] Remove masked palignr intrinsics and auto-upgrade them to native IR of vector shuffle and select.
llvm-svn: 271872
2016-06-06 06:12:54 +00:00
Craig Topper 143446d5c1 [AVX512] Add PALIGNR shuffle lowering for v32i16 and v16i32.
llvm-svn: 271870
2016-06-06 05:39:10 +00:00
Craig Topper ccad6d57c1 [AVX512] Update tests to show shuffle decoding for vpshuflw/vpshufhw.
llvm-svn: 271869
2016-06-06 05:39:07 +00:00
Simon Pilgrim 64c6de4525 [X86][XOP] Added VPERMIL2PD/VPERMIL2PS raw mask decoding for target shuffle combines
llvm-svn: 271834
2016-06-05 15:21:30 +00:00
Simon Pilgrim 478295dadd [X86][XOP] Added VPERMIL2PD/VPERMIL2PS as a target shuffle type
llvm-svn: 271831
2016-06-05 15:01:45 +00:00
Craig Topper 8eeda57a40 [AVX512] Add support for lowering PALIGNR for v64i8.
Could do this for other types too, but this is what's needed to replace the intrinsic with native IR in clang.

llvm-svn: 271828
2016-06-05 06:29:12 +00:00
Craig Topper 5a315d4613 [AVX512] Split command lines and regenerate a test to prepare for a future commit.
llvm-svn: 271827
2016-06-05 06:29:08 +00:00
Craig Topper 9f51c9ef15 [AVX512] Fix PANDN combining for v4i32/v8i32 when VLX is enabled.
v4i32/v8i32 ANDs aren't promoted to v2i64/v4i64 when VLX is enabled.

llvm-svn: 271826
2016-06-05 05:35:11 +00:00
Simon Pilgrim 2ead861d07 [X86][XOP] Added VPERMIL2PD/VPERMIL2PS shuffle mask comment decoding
llvm-svn: 271809
2016-06-04 21:44:28 +00:00
Saleem Abdulrasool 1fcdc23a6e X86: enable TLS on Windows itanium
Windows itanium is nearly identical to windows-msvc (MS ABI for C, itanium for
C++).  Enable the TLS support for the target similar to the MSVC model.

llvm-svn: 271797
2016-06-04 18:27:22 +00:00
Simon Pilgrim fd2eda4f64 [X86][AVX2] Fix v16i16 SHL lowering (PR27730)
The AVX2 v16i16 shift lowering works by unpacking to 2 x v8i32, performing the shift and then truncating the result.

The unpacking is used to place the values in the upper 16-bits so that we can correctly sign-extend for SRA shifts. Unfortunately we weren't zeroing the lower 16-bits, which SHL needs in order to shift in zero bits correctly.

llvm-svn: 271796
2016-06-04 16:45:33 +00:00
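
The v16i16 SHL issue described above can be illustrated with a scalar model of a single lane. This is only a sketch in Python, not the LLVM lowering code: the function name and the `low_bits` parameter are hypothetical, and `low_bits` stands in for whatever ends up in the lower 16 bits of the 32-bit lane after unpacking.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF

def shl16_via_32(x, amt, low_bits):
    """Model one lane: put x in the upper 16 bits of a 32-bit lane,
    shift left by amt, then keep the upper 16 bits as the result.
    low_bits (hypothetical) is the content of the lower 16 bits."""
    lane = ((x & MASK16) << 16) | (low_bits & MASK16)
    shifted = (lane << amt) & MASK32   # 32-bit shift, overflow discarded
    return shifted >> 16               # truncate back to 16 bits

# With zeroed low bits the result matches a plain 16-bit shift.
print(hex(shl16_via_32(0x0001, 4, 0x0000)))
# With non-zero low bits, stray bits are shifted into the result.
print(hex(shl16_via_32(0x0001, 4, 0xFFFF)))
```

In this model, any non-zero content in the lower half leaks into the upper half as the lane is shifted left, which is why the lower 16 bits have to be zero for SHL.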
Simon Pilgrim ff35eecd90 [X86][AVX512] Fixed 512-bit vector nontemporal load alignment
llvm-svn: 271673
2016-06-03 14:12:43 +00:00
Simon Pilgrim f92d175a78 [X86][AVX512] Added 512-bit vector nontemporal load tests
llvm-svn: 271668
2016-06-03 13:42:49 +00:00
Simon Pilgrim a6022c9a63 [X86][SSE] Added nontemporal load tests
These currently all lower to regular loads; generic nontemporal load support will be added in a future patch.

llvm-svn: 271659
2016-06-03 11:00:55 +00:00
Simon Pilgrim 960ca812ed [X86] Added nontemporal scalar store tests
llvm-svn: 271656
2016-06-03 10:30:54 +00:00
Simon Pilgrim 02284541b2 [X86][SSE] Regenerated nontemporal vector store tests and added extra target types
llvm-svn: 271654
2016-06-03 10:24:24 +00:00
Simon Pilgrim 38b4661b1b [X86] Regenerated nontemporal store tests and added tests for all 128-bit vector types
llvm-svn: 271651
2016-06-03 10:15:36 +00:00
Simon Pilgrim 205f65f62f [X86][AVX2] Relaxed alignment on nontemporal store tests
llvm-svn: 271646
2016-06-03 10:06:59 +00:00
Simon Pilgrim 8ea8940677 [X86][AVX2] Regenerated nontemporal store tests and added tests for all 256-bit vector types
llvm-svn: 271645
2016-06-03 09:56:24 +00:00
Simon Pilgrim e85506b6e0 [X86][XOP] Support for VPERMIL2PD/VPERMIL2PS 2-input shuffle instructions
This patch begins adding support for lowering to the XOP VPERMIL2PD/VPERMIL2PS shuffle instructions - adding the X86ISD::VPERMIL2 opcode and cleaning up the usage.

The internal llvm intrinsics were assuming the shuffle mask operand was the same type as the float/double input operands (I guess to simplify the intrinsic definitions in X86InstrXOP.td to a single value type). These needed changing to integer types (matching the clang builtin and the AMD intrinsics definitions); an auto-upgrade path is added to convert old calls.

Mask decoding/target shuffle support will be added in future patches.

Differential Revision: http://reviews.llvm.org/D20049

llvm-svn: 271633
2016-06-03 08:06:03 +00:00
Craig Topper e7ae106147 [AVX512] Ensure EVEX vpshufd, vpshuflw, and vpshufhw have isel priority over the VEX encoded ones.
llvm-svn: 271629
2016-06-03 05:31:04 +00:00
Craig Topper 01f53b1773 [AVX512] Fix shuffle comment printing for EVEX encoded PSHUFD, PSHUFHW, and PSHUFLW.
llvm-svn: 271628
2016-06-03 05:31:00 +00:00
Simon Pilgrim ab95b2fe26 [X86][SSE] Added SSE41/AVX2 non-temporal tests
Useful for when we add MOVNTDQA support

llvm-svn: 271552
2016-06-02 18:01:21 +00:00
Dimitry Andric 6a482a73d6 Only attempt to detect AVG if SSE2 is available
Summary:
In PR29973 Sanjay Patel reported an assertion failure when a certain
loop was optimized, for a target without SSE2 support.  It turned out
this was because of the AVG pattern detection introduced in rL253952.

Prevent the assertion failure by bailing out early in
`detectAVGPattern()`, if the target does not support SSE2.

Also add a minimized test case.

Reviewers: congh, eli.friedman, spatel

Subscribers: emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D20905

llvm-svn: 271548
2016-06-02 17:30:49 +00:00
Sanjay Patel f509d85a6d [DAG] use getBitcast() to reduce code
Although this was intended to be NFC, the test case wiggle shows a change in
code scheduling/RA caused by a difference in the SDLoc() generation.

Depending on how you look at it, this is the (dis)advantage of exact checking
in regression tests.

llvm-svn: 271526
2016-06-02 16:01:15 +00:00
Simon Pilgrim ebdc397c86 [X86][SSE] Added non-temporal load tests for vector types
These currently lower to regular loads instead of MOVNTDQA

llvm-svn: 271516
2016-06-02 13:51:50 +00:00
Simon Pilgrim 0afd5a4d80 [X86][SSE] Replace (V)CVTTPS2DQ and VCVTTPD2DQ truncating (round to zero) f32/f64 to i32 with generic IR (llvm)
This patch removes the llvm intrinsics (V)CVTTPS2DQ and VCVTTPD2DQ truncation (round to zero) conversions and auto-upgrades to FP_TO_SINT calls instead.

Note: I looked at updating CVTTPD2DQ as well but this still requires a lot more work to correctly lower.

Differential Revision: http://reviews.llvm.org/D20860

llvm-svn: 271510
2016-06-02 10:55:21 +00:00
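
As a rough illustration of the truncating (round to zero) conversion mentioned above, the following Python sketch converts each element by discarding its fractional part. The function names are hypothetical, and the model ignores out-of-range and NaN inputs; a round-to-nearest version is included only for contrast.

```python
import math

def fp_to_sint_trunc(values):
    """Truncating conversion: drop the fractional part of each element."""
    return [math.trunc(v) for v in values]

def fp_to_sint_nearest(values):
    """Round-to-nearest conversion, shown only for comparison."""
    return [round(v) for v in values]

vals = [2.7, -2.7, 0.5, -0.5]
print(fp_to_sint_trunc(vals))
print(fp_to_sint_nearest(vals))
```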
Craig Topper ca9c0801e1 [X86] Add AVX 256-bit load and stores to fast isel.
I'm not sure why this was missing for so long.

This also exposed that we were picking floating point 256-bit VMOVNTPS for some integer types in normal isel for AVX1 even though VMOVNTDQ is available. In practice it doesn't matter due to the execution dependency fix pass, but it required extra isel patterns. Fixing that in a follow up commit.

llvm-svn: 271481
2016-06-02 04:19:45 +00:00
Craig Topper f10fbfa738 [AVX512] Remove masked load intrinsics. Clang now emits generic masked load intrinsics instead.
The intrinsics will be autoupgraded to the same generic masked loads.

llvm-svn: 271478
2016-06-02 04:19:36 +00:00
Sanjay Patel b4a4357ecb [x86, AVX2] regenerate checks
llvm-svn: 271434
2016-06-01 21:32:56 +00:00
Michael Kuperstein 738ae45ce8 [DAG] Improve legalization of INSERT_SUBVECTOR
When the index is known to be constant 0, insert directly into the low half,
instead of spilling, performing the insert in-memory, and reloading.

Differential Revision: http://reviews.llvm.org/D20763

llvm-svn: 271428
2016-06-01 20:49:35 +00:00
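
The idea behind the INSERT_SUBVECTOR change above can be sketched with Python lists standing in for vectors. This is an assumption-laden model, not the legalizer code: both function names are hypothetical, and the vector is assumed to be split into two equal halves with the subvector no wider than one half.

```python
def insert_subvector(vec, sub, idx):
    """Reference semantics: overwrite len(sub) elements of vec at idx."""
    assert idx + len(sub) <= len(vec)
    return vec[:idx] + sub + vec[idx + len(sub):]

def insert_at_zero_via_low_half(vec, sub):
    """Index-0 case: split vec into halves and touch only the low half."""
    half = len(vec) // 2
    lo, hi = vec[:half], vec[half:]
    assert len(sub) <= half
    new_lo = sub + lo[len(sub):]   # insert directly into the low half
    return new_lo + hi             # the high half is reused unchanged

vec = [0, 1, 2, 3, 4, 5, 6, 7]
sub = [90, 91]
print(insert_at_zero_via_low_half(vec, sub))
```

For index 0 the two functions agree, which is why no trip through memory is needed in this model.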
Than McIntosh 4ef761aa35 Better fix for PR27903.
Summary:
Re-enable lifetime-start-on-first-use for stack coloring,
but explicitly disable it for slots with more than one start
or end lifetime marker.

Bug: 27903

Reviewers: wmi, tejohnson, qcolombet, gbiv

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D20739

llvm-svn: 271412
2016-06-01 17:55:10 +00:00
Simon Pilgrim 1cd61b82bd [X86][SSE] Added non-temporal store tests for all 512-bit vector types
llvm-svn: 271393
2016-06-01 13:58:00 +00:00
Simon Pilgrim 288be8bab6 [X86][SSE] Added non-temporal store tests for all 256-bit vector types
Also added KNL AVX-512 checks

llvm-svn: 271391
2016-06-01 13:20:25 +00:00
Simon Pilgrim 80f5335969 [X86][SSE] Added non-temporal store tests for all 128-bit integer vector types
llvm-svn: 271389
2016-06-01 13:05:00 +00:00
Michael Zuckerman 6a894956fc Adding back-end support to two bit scanning intrinsics
Adding LLVM back-end support to two intrinsics dealing with bit scan: _bit_scan_forward and _bit_scan_reverse.
Their functionality is as described in Intel intrinsics guide:
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_bit_scan_forward&expand=371,370
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_bit_scan_reverse&expand=371,370

Commit on behalf of Omer Paparo Bivas


Differential Revision: http://reviews.llvm.org/D19915

llvm-svn: 271386
2016-06-01 12:02:37 +00:00
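
For readers unfamiliar with the two bit scan intrinsics named above, here is a minimal Python model of their results for 32-bit inputs. The function names are hypothetical stand-ins, not the intrinsics themselves. Raising an error for zero is a choice made for this sketch; as far as I know, the underlying instructions leave the result undefined in that case.

```python
def bit_scan_forward(x):
    """Index of the least significant set bit of a 32-bit value."""
    x &= 0xFFFFFFFF
    if x == 0:
        raise ValueError("zero input is not modelled in this sketch")
    return (x & -x).bit_length() - 1   # x & -x isolates the lowest set bit

def bit_scan_reverse(x):
    """Index of the most significant set bit of a 32-bit value."""
    x &= 0xFFFFFFFF
    if x == 0:
        raise ValueError("zero input is not modelled in this sketch")
    return x.bit_length() - 1

print(bit_scan_forward(0b1010), bit_scan_reverse(0b1010))
```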
Craig Topper 4f2d5a68d3 Revert r271362 "[AVX512] Remove masked load intrinsics. Clang now emits generic masked load intrinsics instead."
Looks like something isn't quite right still. Also forgot to move the test cases to an autoupgrade test.

llvm-svn: 271363
2016-06-01 05:57:55 +00:00
Craig Topper dacd9d2bac [AVX512] Remove masked load intrinsics. Clang now emits generic masked load intrinsics instead.
The intrinsics will be autoupgraded to the same generic masked loads.

llvm-svn: 271362
2016-06-01 05:35:16 +00:00
Kevin B. Smith ed0b620a65 [X86]: Add a pattern that uses GR16_ABCD rather than GR32_ABCD to avoid falsely marking whole 32 bit register as live.
Differential Revision: http://reviews.llvm.org/D20649

llvm-svn: 271341
2016-05-31 22:00:12 +00:00
Simon Pilgrim e05dc45897 [X86][SSE] Add load-folding patterns for (V)CVTDQ2PD (PR27291)
Added patterns for (V)CVTDQ2PD -> 2f64 loading from a 64-bit source.

llvm-svn: 271269
2016-05-31 12:04:35 +00:00
Igor Breger 73ee8ba9b0 [AVX512] Fix intrinsic vcvtps2ph lowering.
Differential Revision: http://reviews.llvm.org/D20788

llvm-svn: 271255
2016-05-31 08:04:21 +00:00
Igor Breger 52bd1d5fcc Fix intrinsic vbroadcast{i32|f32}x2 lowering.
Differential Revision: http://reviews.llvm.org/D20780

llvm-svn: 271254
2016-05-31 07:43:39 +00:00
Craig Topper 50f85c22c5 [AVX512] Remove masked store intrinsics. Clang now emits generic masked store intrinsics instead.
The intrinsics will be autoupgraded to the same generic masked stores.

llvm-svn: 271245
2016-05-31 01:50:02 +00:00
Saleem Abdulrasool d2f705ddf9 X86: permit using SjLj EH on x86 targets as an option
This adds support to the backend to actually support SjLj EH as an exception
model.  This is *NOT* the default model, and requires explicitly opting into it
from the frontend.  GCC supports this model and for MinGW can still be enabled
via the `--using-sjlj-exceptions` option.

Addresses PR27749!

llvm-svn: 271244
2016-05-31 01:48:07 +00:00
Craig Topper 8287fd8abd [X86] Remove SSE/AVX unaligned store intrinsics as clang no longer uses them. Auto upgrade to native unaligned store instructions.
llvm-svn: 271236
2016-05-30 23:15:56 +00:00
Craig Topper 39716f8358 [X86] Use update_llc_test_checks.py to re-generate a test in preparation for an upcoming commit. NFC
llvm-svn: 271234
2016-05-30 22:54:14 +00:00
Simon Pilgrim d788c9d83d [X86][XOP] Split off auto-upgraded xop intrinsics
llvm-svn: 271228
2016-05-30 19:50:56 +00:00