Commit Graph

13202 Commits

Author SHA1 Message Date
Simon Pilgrim 4642a57fbf Revert: r270973 - [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm)
llvm-svn: 270976
2016-05-27 09:02:25 +00:00
Simon Pilgrim c013e5737b [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm)
This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused.

A companion patch (D20684) removes/auto-upgrade the clang intrinsics.

Differential Revision: http://reviews.llvm.org/D20686

llvm-svn: 270973
2016-05-27 08:49:15 +00:00
Simon Pilgrim cf340bd9c1 [X86][SSE] When lowering a 256-bit shuffle as PMOVZX, reduce the input vector to the lower 128-bit subvector.
Most often as not this is what it started out as, the extraction is zero-cost on AVX and the PMOVZX/PMOVSX folding logic is based around 128-bit loads.

llvm-svn: 270858
2016-05-26 15:40:36 +00:00
Rafael Espindola a224de06bc Use shouldAssumeDSOLocal on AArch64.
This reduces code duplication and now AArch64 also handles PIE.

llvm-svn: 270844
2016-05-26 12:42:55 +00:00
Igor Breger 8437bb70fd [AVX512] Fix intrinsic cmp{sd|ss} lowering.
Differential Revision: http://reviews.llvm.org/D20615

llvm-svn: 270843
2016-05-26 12:42:25 +00:00
Simon Pilgrim 4810683ddf Simplify std::all_of/any_of predicates by using llvm::all_of/any_of. NFCI.
llvm-svn: 270753
2016-05-25 20:41:11 +00:00
Rafael Espindola 84f0562064 Fix shouldAssumeDSOLocal for private linkage.
llvm-svn: 270746
2016-05-25 19:55:16 +00:00
Sanjay Patel aedc347b29 [x86] avoid code explosion from LoopVectorizer for gather loop (PR27826)
By making pointer extraction from a vector more expensive in the cost model,
we avoid the vectorization of a loop that is very likely to be memory-bound:
https://llvm.org/bugs/show_bug.cgi?id=27826

There are still bugs related to this, so we may need a more general solution
to avoid vectorizing obviously memory-bound loops when we don't have HW gather
support.

Differential Revision: http://reviews.llvm.org/D20601

llvm-svn: 270729
2016-05-25 17:27:54 +00:00
Sanjay Patel 3955360b24 [x86, AVX] allow explicit calls to VZERO* to modify state in VZeroUpperInserter pass (PR27823)
As noted in the review, there are still problems, so this doesn't the bug completely.

Differential Revision: http://reviews.llvm.org/D20529

llvm-svn: 270718
2016-05-25 16:39:47 +00:00
Simon Pilgrim 4298d06d0f [X86][SSE] Replace (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) lossless conversion intrinsics with generic IR
Followup to D20528 clang patch, this removes the (V)CVTDQ2PD(Y) and (V)CVTPS2PD(Y) llvm intrinsics and auto-upgrades to sitofp/fpext instead.

Differential Revision: http://reviews.llvm.org/D20568

llvm-svn: 270678
2016-05-25 08:59:18 +00:00
Craig Topper 12e322a8cf [X86] Remove the llvm.x86.sse2.storel.dq intrinsic. It hasn't been used in a long time.
llvm-svn: 270677
2016-05-25 06:56:32 +00:00
Igor Breger 23c2090606 [llvm][AVX512][intrinsics] Fix vperm{b|w|d|q|ps|pd} intrinsics. Index is second argument to buildin function but it is first instruction operand.
Differential Revision: http://reviews.llvm.org/D20515

llvm-svn: 270548
2016-05-24 11:06:22 +00:00
Simon Pilgrim 14000b3cea [CostModel][X86][XOP] Added XOP costmodel for BITREVERSE
Now that we have a nice fast VPPERM solution. Added framework for future intrinsic costs as well.

llvm-svn: 270537
2016-05-24 08:17:50 +00:00
Sanjay Patel 8099fb7e0a fix typo; NFC
llvm-svn: 270469
2016-05-23 18:01:20 +00:00
Sanjay Patel 13a0d49813 use range-loop; NFCI
llvm-svn: 270467
2016-05-23 18:00:50 +00:00
Aaron Ballman a81264ba09 Removing a switch statement that contains only a default label; NFC.
llvm-svn: 270444
2016-05-23 15:52:59 +00:00
Craig Topper a6d0104823 [X86] Use instruction aliases to replace custom asm parser code for optimizing moves to use 2 byte VEX prefix.
llvm-svn: 270394
2016-05-23 04:02:27 +00:00
Craig Topper 95bdabd338 [AVX512] Add patterns to implement stores of extracts of least signficant subvectors using XMM or YMM stores instead of the vector extract instructions.
Similar is already done for AVX and we had lost it going to AVX512VL.

llvm-svn: 270383
2016-05-22 23:44:33 +00:00
Sanjay Patel 2959ff4a88 [x86, AVX] don't add a vzeroupper if that's what the code is already doing (PR27823)
This isn't the complete fix, but it handles the trivial examples of duplicate vzero* ops in PR27823:
https://llvm.org/bugs/show_bug.cgi?id=27823
...and amusingly, the bogus cases already exist as regression tests, so let's take this baby step.

We'll need to do more in the general case where there's legitimate AVX usage in the function + there's
already a vzero in the code.

Differential Revision: http://reviews.llvm.org/D20477

llvm-svn: 270378
2016-05-22 20:22:47 +00:00
Igor Breger 2ba64ab9ae [AVX512] Implement missing patterns for any_extend load lowering.
Differential Revision: http://reviews.llvm.org/D20513

llvm-svn: 270357
2016-05-22 10:21:04 +00:00
Craig Topper 5f3fef884f [AVX512] The AVX512 file only need subtract_subvector index 0 patterns where the source is 512-bits. The 256-bit source patterns were redundant with AVX.
llvm-svn: 270356
2016-05-22 07:40:58 +00:00
Craig Topper a1041ff001 [AVX512] Add an AddedComplexity line to the 512-bit insert_subvector undef index 0 patterns. This gives them higher priority than the memory patterns. This matches AVX1/2.
llvm-svn: 270355
2016-05-22 07:40:40 +00:00
Craig Topper de5498546e [AVX512] Change the AddedComplexity on some patterns to match their AVX/SSE equivalents. This helps group them close together in the isel tables and enable table compression.
llvm-svn: 270354
2016-05-22 06:09:34 +00:00
Craig Topper 33c550cb95 [AVX512] Add a couple patterns to fix some cases where two vector mask inversions could appear in a row.
llvm-svn: 270344
2016-05-22 00:39:30 +00:00
Craig Topper dbac1ff9c1 [AVX512] Remove seemingly unnecessary AddedComplexity adjustment.
llvm-svn: 270343
2016-05-22 00:39:27 +00:00
Craig Topper 3fc3b4453d [X86] Remove unnecessary alignment check on patterns that use VEXTRACTF128 for integer types when only AVX1 is supported.
llvm-svn: 270335
2016-05-21 22:50:18 +00:00
Craig Topper db960eddfa [AVX512] Add patterns for extracting subvectors and storing to memory.
llvm-svn: 270334
2016-05-21 22:50:14 +00:00
Craig Topper 03b849eb44 [AVX512] Capitalize the Z in VEXTRACTPSzmr. Lowercase z has been primarily used to indicating the zero masking behavior which is not the case here. NFC
llvm-svn: 270333
2016-05-21 22:50:11 +00:00
Craig Topper d5da6a39f2 [AVX512] Rename vector extract instructions so 'mr' intead of 'rm' to reflect the fact that memory is the destination.
llvm-svn: 270332
2016-05-21 22:50:09 +00:00
Craig Topper 08a6857c82 [AVX512] Fix copy/paste mistake a I made in a comment.
llvm-svn: 270331
2016-05-21 22:50:04 +00:00
Michael Zuckerman a63a129749 [Clang][AVX512][intrinsics] Fix rcp and sqrt intrinsics.
Differential Revision: http://reviews.llvm.org/D20438

llvm-svn: 270322
2016-05-21 14:44:18 +00:00
Michael Zuckerman 11b55b29d1 [Clang][AVX512][intrinsics] Fix vscalef intrinsics.
Differential Revision: http://reviews.llvm.org/D20324

llvm-svn: 270321
2016-05-21 11:09:53 +00:00
Craig Topper 02626c076b [AVX512] Add patterns for VEXTRACT v16i16->v8i16 and v32i8->v16i8. Disable AVX2 versions of vector extract when AVX512VL is enabled.
llvm-svn: 270318
2016-05-21 07:08:56 +00:00
Craig Topper 22ae353207 [AVX512] Disable AVX2 VPERMD, VPERMQ, VPERMPS, and VPERMPD patterns when AVX512VL is enabled. Also add shuffle comment printing for AVX512VL VPERMPD/VPERMQ to keep some tests that now use these instructions instead of the AVX2 ones.
llvm-svn: 270317
2016-05-21 06:07:18 +00:00
Craig Topper 6be70deda3 [AVX512] Disable AVX/AVX2 VBROADCASTSS/VBROADCASTSD patterns when AVX512VL is enabled.
llvm-svn: 270316
2016-05-21 05:47:25 +00:00
Craig Topper 97565ded80 [AVX512] Disable AVX/AVX2 patterns for VPSADBW and VPMULUDQ when the AVX512VL/AVX512BWI equivalents are available.
llvm-svn: 270311
2016-05-21 03:52:32 +00:00
Craig Topper b395105584 [X86] Convert some SSE2/AVX2 intrinsics to ISD opcodes during lowering instead of pattern matching the intrinsics. This unifies handling with AVX512 and allows these intrinsics to select EVEX encoded instructions to increase available registers.
llvm-svn: 270310
2016-05-21 03:52:28 +00:00
David Majnemer 498f2fd11b Address post-review for r270246
This gets rid of some unnecessary SmallStrings in
X86TargetMachine::getSubtargetImpl.

No functionality change is intended.

llvm-svn: 270270
2016-05-20 20:41:24 +00:00
David Majnemer ca29023b02 [X86] Reduce memory allocations in X86TargetMachine::getSubtargetImpl
We performed a number of memory allocations each time getTTI was called,
remove them by using SmallString.
No functionality change intended.

llvm-svn: 270246
2016-05-20 18:16:06 +00:00
Sanjay Patel 5496a23316 fix comments; NFC
llvm-svn: 270237
2016-05-20 17:07:19 +00:00
Sanjay Patel 1dc57cb944 use range-loops; NFCI
llvm-svn: 270236
2016-05-20 17:00:10 +00:00
Sanjay Patel 8bc63b2f47 fix documentation comments; NFC
llvm-svn: 270234
2016-05-20 16:46:01 +00:00
Simon Pilgrim 55ef3da27b [X86][AVX] Generalized matching for target shuffle combines
This patch is a first step towards a more extendible method of matching combined target shuffle masks.

Initially this just pulls out the existing basic mask matches and adds support for some 256/512 bit equivalents. Future patterns will require a number of features to be added but I wanted to keep this patch simple.

I hope we can avoid duplication between shuffle lowering and combining and share more complex pattern match functions in future commits.

Differential Revision: http://reviews.llvm.org/D19198

llvm-svn: 270230
2016-05-20 16:19:30 +00:00
Rafael Espindola c7e9813228 Refactor X86 symbol access classification.
This refactors the logic in X86 to avoid code duplication. It also
splits it in two steps: it first decides if a symbol is local to the DSO
and then uses that information to decide how to access it.

The first part is implemented by shouldAssumeDSOLocal. It is not in any
way specific to X86. In a followup patch I intend to move it to
somewhere common and reused it in other backends.

llvm-svn: 270209
2016-05-20 12:20:10 +00:00
Craig Topper b182715a52 [X86] Fix another AVX pattern to only be disable if VLX and BWI are supported.
llvm-svn: 270182
2016-05-20 05:10:27 +00:00
Craig Topper 0a7a8dee2b [X86] Fix some AVX patterns to only be disabled if VLX and BWI are supported. Without this we get isel failures on the avx-intrinsics-x86.ll test in AVX512VL.
llvm-svn: 270174
2016-05-20 02:00:08 +00:00
Rafael Espindola ab03eb007c Record a TargetMachine instead of a Reloc::Model.
Addresses r270095's code review.

llvm-svn: 270147
2016-05-19 22:07:57 +00:00
Hans Wennborg 172eee9cfc X86: Don't reset the stack after calls that don't return (PR27117)
Since the calls don't return, the instruction afterwards will never run,
and is just taking up unnecessary space in the binary.

Differential Revision: http://reviews.llvm.org/D20406

llvm-svn: 270109
2016-05-19 20:15:33 +00:00
Rafael Espindola 46107b9e62 Remember the relocation model. NFC.
This avoids passing a TargetMachine in a few places.

llvm-svn: 270095
2016-05-19 18:49:29 +00:00
Rafael Espindola cb2d266360 Style fixes. NFC.
llvm-svn: 270093
2016-05-19 18:34:20 +00:00