Commit Graph

7517 Commits

Author SHA1 Message Date
Simon Pilgrim cf340bd9c1 [X86][SSE] When lowering a 256-bit shuffle as PMOVZX, reduce the input vector to the lower 128-bit subvector.
Most often as not this is what it started out as, the extraction is zero-cost on AVX and the PMOVZX/PMOVSX folding logic is based around 128-bit loads.

llvm-svn: 270858
2016-05-26 15:40:36 +00:00
Simon Pilgrim 50c37ceb3b [X86][SSE] Added load_zext_16i8_to_8i32 test
Odd issue with input vector not being folded into pmovzx on AVX2+ targets

llvm-svn: 270852
2016-05-26 14:45:30 +00:00
Igor Breger 8437bb70fd [AVX512] Fix intrinsic cmp{sd|ss} lowering.
Differential Revision: http://reviews.llvm.org/D20615

llvm-svn: 270843
2016-05-26 12:42:25 +00:00
Simon Pilgrim ab3809193c [X86][F16C] Added F16C fast-isel tests to match clang/test/CodeGen/f16c-builtins.c
llvm-svn: 270837
2016-05-26 10:26:56 +00:00
Simon Pilgrim 0e4fdc0842 [X86][AVX2] Added gather fast-isel tests to match clang/test/CodeGen/avx2-builtins.c
llvm-svn: 270835
2016-05-26 10:07:05 +00:00
Simon Pilgrim d6469e3467 [X86][SSE41] Removed pblendw intrinsics tests - they are auto-upgraded
Equivalent tests included in sse41-intrinsics-x86-upgrade.ll - the i8/i32 immediate diff doesn't matter anymore

llvm-svn: 270767
2016-05-25 21:27:58 +00:00
Simon Pilgrim fa814259ad [X86][SSE41] Regenerated intrinsics tests
llvm-svn: 270764
2016-05-25 21:21:51 +00:00
Simon Pilgrim 1bed207f88 [X86][SSE41] Removed blendpd/blendps intrinsics tests - they are auto-upgraded
Equivalent tests included in sse41-intrinsics-x86-upgrade.ll

llvm-svn: 270761
2016-05-25 21:06:36 +00:00
Simon Pilgrim 971abe8256 [X86][AVX2] Regenerate avx2 vector shift tests
llvm-svn: 270756
2016-05-25 21:00:40 +00:00
Rafael Espindola 84f0562064 Fix shouldAssumeDSOLocal for private linkage.
llvm-svn: 270746
2016-05-25 19:55:16 +00:00
Hal Finkel 6f3387f434 [SDAG] Add a fallback multiplication expansion
LegalizeIntegerTypes does not have a way to expand multiplications for large
integer types (i.e. larger than twice the native bit width). There's no
standard runtime call to use in that case, and so we'd just assert.

Unfortunately, as it turns out, it is possible to hit this case from
standard-ish C code in rare cases. A particular case a user ran into yesterday
involved an __int128 induction variable and a loop with a quadratic (not
linear) recurrence which triggered some backend logic using SCEVExpander. In
this case, the BinomialCoefficient code in SCEV generates some i129 variables,
which get widened to i256. At a high level, this is not actually good (i.e. the
underlying optimization, PPCLoopPreIncPrep, should not be transforming the loop
in question for performance reasons), but regardless, the backend shouldn't
crash because of cost-modeling issues in the optimizer.

This is a straightforward implementation of the multiplication expansion, based
on the algorithm in Hacker's Delight. I validated it against the code for the
mul256b function from http://locklessinc.com/articles/256bit_arithmetic/ using
random inputs. There should be no functional change for previously-working code
(the new expansion code only replaces an assert).

Fixes PR19797.

llvm-svn: 270720
2016-05-25 16:50:22 +00:00
Sanjay Patel 3955360b24 [x86, AVX] allow explicit calls to VZERO* to modify state in VZeroUpperInserter pass (PR27823)
As noted in the review, there are still problems, so this doesn't the bug completely.

Differential Revision: http://reviews.llvm.org/D20529

llvm-svn: 270718
2016-05-25 16:39:47 +00:00
Simon Pilgrim 11081c98a3 [X86][AVX] Sync with clang/test/CodeGen/avx2-builtins.c
Only tests for the gather intrinsic are still to be added

llvm-svn: 270710
2016-05-25 15:30:08 +00:00
Simon Pilgrim 1bcf9847a4 [X86][AVX2] Added more fast-isel tests to match clang/test/CodeGen/avx2-builtins.c
llvm-svn: 270685
2016-05-25 10:56:23 +00:00
Simon Pilgrim c7dcbdc08a [X86][AVX2] Begun adding fast-isel tests to match clang/test/CodeGen/avx2-builtins.c
llvm-svn: 270683
2016-05-25 10:15:06 +00:00
Simon Pilgrim 4d1e258097 [X86][SSE2] Use storeu intrinsics for _mm_storeu_pd/_mm_storeu_pd tests
Also fixed name of _mm_store1_pd test

llvm-svn: 270681
2016-05-25 09:42:29 +00:00
Simon Pilgrim f0ba364fb9 [X86][SSE] Use storeu intrinsics for _mm_storeu_ps test
llvm-svn: 270680
2016-05-25 09:28:06 +00:00
Simon Pilgrim 4298d06d0f [X86][SSE] Replace (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) lossless conversion intrinsics with generic IR
Followup to D20528 clang patch, this removes the (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) llvm intrinsics and auto-upgrades to sitofp/fpext instead.

Differential Revision: http://reviews.llvm.org/D20568

llvm-svn: 270678
2016-05-25 08:59:18 +00:00
Craig Topper 12e322a8cf [X86] Remove the llvm.x86.sse2.storel.dq intrinsic. It hasn't been used in a long time.
llvm-svn: 270677
2016-05-25 06:56:32 +00:00
Than McIntosh 879ad8fa99 Rework/enhance stack coloring data flow analysis.
Replace bidirectional flow analysis to compute liveness with forward
analysis pass. Treat lifetimes as starting when there is a first
reference to the stack slot, as opposed to starting at the point of the
lifetime.start intrinsic, so as to increase the number of stack
variables we can overlap.

Reviewers: gbiv, qcolumbet, wmi
Differential Revision: http://reviews.llvm.org/D18827

Bug: 25776
llvm-svn: 270559
2016-05-24 13:23:44 +00:00
Simon Pilgrim caf0d9d92c [X86][SSE] Added vector sitofp/uitofp folded load tests
llvm-svn: 270558
2016-05-24 13:07:23 +00:00
Igor Breger 23c2090606 [llvm][AVX512][intrinsics] Fix vperm{b|w|d|q|ps|pd} intrinsics. Index is second argument to buildin function but it is first instruction operand.
Differential Revision: http://reviews.llvm.org/D20515

llvm-svn: 270548
2016-05-24 11:06:22 +00:00
Simon Pilgrim 8a5ff3c59a [X86][SSE] Updated (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) fast-isel codegen to match D20528
llvm-svn: 270501
2016-05-23 22:17:36 +00:00
Simon Pilgrim 8cfcf586bb [X86][SSE] Added cvtdq2pd/cvtps2pd generic IR tests
Added D20528 implementations as well as existing x86 intrinsics versions

llvm-svn: 270494
2016-05-23 21:45:02 +00:00
Simon Pilgrim f615191fa6 [X86][SSE] Use shuffle/sext instead of deprecated (+ auto-upgraded) pmovsxwd intrinsic call
llvm-svn: 270489
2016-05-23 21:21:38 +00:00
Asaf Badouh d32e4c9f0d [X86][RTM] _xabort() should not have "noreturn" attribute
Differential Revision: http://reviews.llvm.org/D20518

llvm-svn: 270437
2016-05-23 14:04:17 +00:00
Simon Pilgrim 7cc9814aaf [X86][AVX] Added tests that access ymm registers before and after explicit vzeroupper/vzeroall calls
llvm-svn: 270434
2016-05-23 13:03:45 +00:00
Simon Pilgrim 4adbf23e1f [X86][SSE] Regenerated scalar load folding tests
llvm-svn: 270431
2016-05-23 12:53:09 +00:00
Simon Pilgrim 07002e86e3 [X86][SSE] Regenerated partial register update tests
llvm-svn: 270430
2016-05-23 12:49:37 +00:00
Simon Pilgrim e699370f3b [X86][SSE] Updated sse/avx cvtsi2sd tests to use non-constant value
llvm-svn: 270425
2016-05-23 12:41:51 +00:00
Simon Pilgrim e6f4d28d6a [X86][SSE2] Regenerated sse2 upgraded intrinsics tests
llvm-svn: 270423
2016-05-23 12:40:11 +00:00
Simon Pilgrim b24542c588 [X86][AVX] Regenerated avx upgraded intrinsics tests
llvm-svn: 270422
2016-05-23 12:39:06 +00:00
Craig Topper 95bdabd338 [AVX512] Add patterns to implement stores of extracts of least signficant subvectors using XMM or YMM stores instead of the vector extract instructions.
Similar is already done for AVX and we had lost it going to AVX512VL.

llvm-svn: 270383
2016-05-22 23:44:33 +00:00
Simon Pilgrim 1ced2a6390 [X86][SSE] Added extra i8 extract element test
llvm-svn: 270379
2016-05-22 20:35:42 +00:00
Sanjay Patel 2959ff4a88 [x86, AVX] don't add a vzeroupper if that's what the code is already doing (PR27823)
This isn't the complete fix, but it handles the trivial examples of duplicate vzero* ops in PR27823:
https://llvm.org/bugs/show_bug.cgi?id=27823
...and amusingly, the bogus cases already exist as regression tests, so let's take this baby step.

We'll need to do more in the general case where there's legitimate AVX usage in the function + there's
already a vzero in the code.

Differential Revision: http://reviews.llvm.org/D20477

llvm-svn: 270378
2016-05-22 20:22:47 +00:00
Sanjay Patel f71fc95173 [x86, AVX] add test file to show vzeroupper pass excesses
llvm-svn: 270375
2016-05-22 19:55:48 +00:00
Igor Breger 2ba64ab9ae [AVX512] Implement missing patterns for any_extend load lowering.
Differential Revision: http://reviews.llvm.org/D20513

llvm-svn: 270357
2016-05-22 10:21:04 +00:00
Craig Topper a1041ff001 [AVX512] Add an AddedComplexity line to the 512-bit insert_subvector undef index 0 patterns. This gives them higher priority than the memory patterns. This matches AVX1/2.
llvm-svn: 270355
2016-05-22 07:40:40 +00:00
Craig Topper dca03f8596 [X86] Add a common check-prefix to both run lines on a test so identical checks appear just once.
llvm-svn: 270345
2016-05-22 00:39:33 +00:00
Craig Topper 33c550cb95 [AVX512] Add a couple patterns to fix some cases where two vector mask inversions could appear in a row.
llvm-svn: 270344
2016-05-22 00:39:30 +00:00
Craig Topper db960eddfa [AVX512] Add patterns for extracting subvectors and storing to memory.
llvm-svn: 270334
2016-05-21 22:50:14 +00:00
Michael Zuckerman a63a129749 [Clang][AVX512][intrinsics] Fix rcp and sqrt intrinsics.
Differential Revision: http://reviews.llvm.org/D20438

llvm-svn: 270322
2016-05-21 14:44:18 +00:00
Michael Zuckerman 11b55b29d1 [Clang][AVX512][intrinsics] Fix vscalef intrinsics.
Differential Revision: http://reviews.llvm.org/D20324

llvm-svn: 270321
2016-05-21 11:09:53 +00:00
Craig Topper 02626c076b [AVX512] Add patterns for VEXTRACT v16i16->v8i16 and v32i8->v16i8. Disable AVX2 versions of vector extract when AVX512VL is enabled.
llvm-svn: 270318
2016-05-21 07:08:56 +00:00
Craig Topper 22ae353207 [AVX512] Disable AVX2 VPERMD, VPERMQ, VPERMPS, and VPERMPD patterns when AVX512VL is enabled. Also add shuffle comment printing for AVX512VL VPERMPD/VPERMQ to keep some tests that now use these instructions instead of the AVX2 ones.
llvm-svn: 270317
2016-05-21 06:07:18 +00:00
Craig Topper 6be70deda3 [AVX512] Disable AVX/AVX2 VBROADCASTSS/VBROADCASTSD patterns when AVX512VL is enabled.
llvm-svn: 270316
2016-05-21 05:47:25 +00:00
Craig Topper 1a23a521bb [AVX512] Use update_llc_test_checks to update some tests so we can see all the instruction encodings and ensure everything is with EVEX.
llvm-svn: 270315
2016-05-21 05:46:58 +00:00
Craig Topper 73f48f4662 [AVX512] Fix test cases I missed in r270311.
llvm-svn: 270313
2016-05-21 03:59:55 +00:00
Simon Pilgrim 55ef3da27b [X86][AVX] Generalized matching for target shuffle combines
This patch is a first step towards a more extendible method of matching combined target shuffle masks.

Initially this just pulls out the existing basic mask matches and adds support for some 256/512 bit equivalents. Future patterns will require a number of features to be added but I wanted to keep this patch simple.

I hope we can avoid duplication between shuffle lowering and combining and share more complex pattern match functions in future commits.

Differential Revision: http://reviews.llvm.org/D19198

llvm-svn: 270230
2016-05-20 16:19:30 +00:00
Simon Pilgrim acb71db577 [X86][AVX] Sync with clang/test/CodeGen/avx-builtins.c
llvm-svn: 270229
2016-05-20 16:05:55 +00:00