Commit Graph

7750 Commits

Author SHA1 Message Date
Wei Mi 90d195a5fd [PM] Port UnreachableBlockElim to the new Pass Manager
Differential Revision: http://reviews.llvm.org/D22124

llvm-svn: 274824
2016-07-08 03:32:49 +00:00
Michael Kuperstein 3e3652aef2 Recommit r274692 - [X86] Transform setcc + movzbl into xorl + setcc
xorl + setcc is generally the preferred sequence due to the partial register
stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller.
This fixes PR28146.

The original commit tried inserting an 8bit-subreg into a GR32 (not GR32_ABCD)
which was not appreciated by fast regalloc on 32-bit.

llvm-svn: 274802
2016-07-07 22:50:23 +00:00
Tim Northover 1d106c5fc2 tests: accept different TargetOpcode values.
These tests don't actually care about the internal opcode number, but have to
be updated whenever we add a new one for GlobalISel. That's bad.

llvm-svn: 274774
2016-07-07 17:51:42 +00:00
Michael Kuperstein edb38a94f8 Revert r274692 to check whether this is what breaks windows selfhost.
llvm-svn: 274771
2016-07-07 16:55:35 +00:00
Craig Topper d5d2a35013 [AVX512] Zero extend the result of vpcmpeq/vpcmpgt and similar intrinsics in the autoupgrade code. This currently results in worse codegen but is needed for correctness.
llvm-svn: 274736
2016-07-07 06:11:07 +00:00
Michael Kuperstein 1ef6c59b1d [X86] Transform setcc + movzbl into xorl + setcc
xorl + setcc is generally the preferred sequence due to the partial register
stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller.

This fixes PR28146.

Differential Revision: http://reviews.llvm.org/D21774

llvm-svn: 274692
2016-07-06 21:56:18 +00:00
Simon Pilgrim 118da63a9d [X86][SSE] Added test cases for missed opportunities to combine pshufb to pslldq/psrldq
llvm-svn: 274631
2016-07-06 15:09:48 +00:00
Elena Demikhovsky ad0a56f3da Re-commit of 274613.
The prev commit failed on compilation.
A minor change in one pattern in lib/Target/X86/X86InstrAVX512.td fixes the failure.

llvm-svn: 274626
2016-07-06 14:15:43 +00:00
Elena Demikhovsky 02ced295aa Reverted 274613 due to compilation failue.
llvm-svn: 274615
2016-07-06 09:11:49 +00:00
Elena Demikhovsky 5a4f2476fd AVX-512: Optimization for patterns with i1 scalar type
The patch removes redundant kmov instructions (not all, we still have a lot of work here) and redundant "and" instructions after "setcc".
I use "AssertZero" marker between X86ISD::SETCC node and "truncate" to eliminate extra "and $1" instruction.
I also changed zext, aext and trunc patterns in the .td file. It allows to remove extra "kmov" instruictions.

This patch fixes https://llvm.org/bugs/show_bug.cgi?id=28173.

Fast ISEL mode is not supported correctly for AVX-512. ICMP/FCMP scalar instruction should return result in k-reg. It will be fixed in one of the next patches. I redirected handling of "cmp" to the DAG builder mode. (The code looks worse in one specific test case, but without this fix the new patch fails).

Differential revision: http://reviews.llvm.org/D21956

llvm-svn: 274613
2016-07-06 09:01:20 +00:00
Simon Pilgrim bec6543d17 [X86][AVX2] Add support for target shuffle combining to BROADCAST
Only support broadcast from vector register so far - memory folding support will have to wait.

llvm-svn: 274572
2016-07-05 20:11:29 +00:00
Simon Pilgrim 48adedffb7 [X86][AVX512] Fixed decoding of permd/permpd variable mask shuffles + enabled them for target shuffle combining
Corrected element mask masking to extract the bottom index bits (now matches the perm2 implementation but for unary inputs).

llvm-svn: 274571
2016-07-05 18:31:17 +00:00
Simon Pilgrim 4e96fbf3c1 [X86][AVX512] Autoupgrade the BROADCAST intrinsics
llvm-svn: 274550
2016-07-05 13:58:47 +00:00
Simon Pilgrim 1e91654b38 [X86][AVX512BW] Added BROADCAST intrinsics fast-isel generic IR tests
llvm-svn: 274545
2016-07-05 13:16:05 +00:00
Simon Pilgrim 20ede63a33 [X86][AVX512] Added BROADCAST intrinsics fast-isel generic IR tests
llvm-svn: 274537
2016-07-05 10:15:14 +00:00
Simon Pilgrim dea33cc2f3 [X86][AVX512] Added VSHUFPD intrinsics fast-isel generic IR tests
llvm-svn: 274534
2016-07-05 09:10:07 +00:00
Simon Pilgrim 8a01915bd2 [X86][AVX512VL] Added VSHUFPD/VSHUFPS intrinsics fast-isel generic IR tests
llvm-svn: 274533
2016-07-05 09:09:41 +00:00
Simon Pilgrim 3ad040909a [X86][AVX512] Add support for lowering shuffles to VSHUFPD
llvm-svn: 274520
2016-07-04 20:41:24 +00:00
Simon Pilgrim 02d435d2f4 [X86][AVX512] Autoupgrade the VPERMPD/VPERMQ intrinsics
llvm-svn: 274506
2016-07-04 14:19:05 +00:00
Simon Pilgrim 8b82fce537 [X86][AVX512] Added VPERMPD/VPERMQ intrinsics fast-isel generic IR tests
llvm-svn: 274503
2016-07-04 13:43:10 +00:00
Simon Pilgrim 9fca300cbe [X86][AVX512] Autoupgrade the VPERMILPD/VPERMILPS intrinsics
llvm-svn: 274498
2016-07-04 12:40:54 +00:00
Simon Pilgrim c8cf2ddb6d [X86][AVX512] Added VPERMILPD/VPERMILPS intrinsics fast-isel generic IR tests
Added PSHUFD tests as well

llvm-svn: 274493
2016-07-04 11:07:50 +00:00
Craig Topper d83f818a3e [CodeGen] Make the code that detects a if a shuffle is really a concatenation of the inputs more general purpose.
We can now handle concatenation of each source multiple times. The previous code just checked for each source to appear once in either order.

This also now handles an entire source vector sized piece having undef indices correctly. We now concat with UNDEF instead of using one of the sources. This is responsible for the test case change.

llvm-svn: 274483
2016-07-04 06:19:35 +00:00
Simon Pilgrim 7f096de0b8 [X86][AVX512] Add support for 512-bit shuffle lowering to VPERMPD/VPERMQ
llvm-svn: 274473
2016-07-03 19:50:06 +00:00
Craig Topper d1eca0f32c [CodeGen] Teach OR combine of shuffles involving zero vectors to better handle undef indices.
Undef indices can now be treated as zeros. Or if its undef ORed with zero, we will keep the undef.

llvm-svn: 274472
2016-07-03 19:37:12 +00:00
Craig Topper 8e826d5abe [X86] Add tests to show that the DAG combine for OR of shuffles with zero vectors doesn't handle undefs as well as it could. Fix coming in another commit.
llvm-svn: 274471
2016-07-03 19:37:10 +00:00
Haicheng Wu b71b2f622a [MBB] add a missing corner case in UpdateTerminator()
After the block placement, if a block ends with a conditional branch, but the
next block is not its successor. The conditional branch should be changed to
unconditional branch.  This patch fixes PR28307, PR28297, PR28402.

Differential Revision: http://reviews.llvm.org/D21811

llvm-svn: 274470
2016-07-03 19:14:17 +00:00
Simon Pilgrim 68ea80649b [X86][AVX512] Add support for VPERMPD/VPERMQ masked shuffle comments
llvm-svn: 274469
2016-07-03 18:40:24 +00:00
Simon Pilgrim a0d73835b2 [X86][AVX512] Add support for 512-bit shuffle decoding of VPERMPD/VPERMQ
llvm-svn: 274468
2016-07-03 18:27:37 +00:00
Simon Pilgrim dbd6db0dc7 [X86][AVX512] Add support for VPALIGNR/PSHUFD/PSHUFHW/PSHUFLW masked shuffle comments
llvm-svn: 274466
2016-07-03 15:00:51 +00:00
Simon Pilgrim 598bdb6bfe [X86][AVX512] Add support for UNPCK masked shuffle comments
llvm-svn: 274464
2016-07-03 14:26:21 +00:00
Simon Pilgrim 1f59076196 [X86][AVX512] Add support for VPERM/VSHUF masked shuffle comments
llvm-svn: 274462
2016-07-03 13:55:41 +00:00
Simon Pilgrim 68f438a036 [X86][AVX512] Add support for PMOVZX masked shuffle comments
llvm-svn: 274461
2016-07-03 13:33:28 +00:00
Simon Pilgrim 7c2fbdc101 [X86][AVX512] Add support for masked shuffle comments
This patch adds support for including the avx512 mask register information in the mask/maskz versions of shuffle instruction comments.

This initial version just adds support for MOVDDUP/MOVSHDUP/MOVSLDUP to reduce the mass of test regenerations, other shuffle instructions can be added in due course.

Differential Revision: http://reviews.llvm.org/D21953

llvm-svn: 274459
2016-07-03 13:08:29 +00:00
Simon Pilgrim 129b720c18 [X86][AVX512] Add support for lowering shuffles to VPERMILPS
llvm-svn: 274458
2016-07-03 12:47:21 +00:00
Simon Pilgrim 99e8a1aa0b [X86][AVX512] Add support for lowering shuffles to VPERMILPD
llvm-svn: 274450
2016-07-02 20:20:12 +00:00
Simon Pilgrim 72052f6de9 [X86][AVX512VL] Add fast-isel MOVDDUP/MOVSLDUP/MOVSHDUP shuffle tests
llvm-svn: 274448
2016-07-02 19:22:46 +00:00
Simon Pilgrim cde7c54baa [X86][AVX512] Add support for 512-bit PSHUFB lowering
llvm-svn: 274444
2016-07-02 18:14:31 +00:00
Simon Pilgrim 77dda7c2e0 [X86][AVX512] Converted the MOVDDUP/MOVSLDUP/MOVSHDUP masked intrinsics to generic IR
llvm-svn: 274443
2016-07-02 17:16:41 +00:00
Simon Pilgrim 19adee9d84 [X86][AVX512] Autoupgrade the MOVDDUP/MOVSLDUP/MOVSHDUP intrinsics
llvm-svn: 274439
2016-07-02 14:42:35 +00:00
Simon Pilgrim f040d8c061 [X86][AVX512] Add support for lowering shuffles to MOVDDUP/MOVSLDUP/MOVSHDUP
llvm-svn: 274436
2016-07-02 12:45:03 +00:00
Simon Pilgrim 5e95390957 [X86][AVX512] Add test cases that should lower to MOVSLDUP/MOVSHDUP
llvm-svn: 274435
2016-07-02 12:20:35 +00:00
Simon Pilgrim a6f262a1f9 [X86][AVX512] Add fast-isel shuffle tests
Its not worth trying to write out tests for all the avx512f builtins yet, just adding tests for lowering of generic IR as we transition to it (shuffles mainly right now).

llvm-svn: 274434
2016-07-02 12:13:29 +00:00
Dehao Chen 7b2c997736 Specify mtriple for the frame-order.ll test.
Summary: original test may have different bahavior on different bot, specifically it broke llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21931

llvm-svn: 274368
2016-07-01 17:35:13 +00:00
Dehao Chen ad2b4e1334 Do not count debug instructions when counting number of uses to reorder frame objects.
Summary: The code generation should be independent of the debug info.

Reviewers: zansari, davidxl, mkuper, majnemer

Subscribers: majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D21911

llvm-svn: 274357
2016-07-01 15:40:25 +00:00
Yunzhong Gao b386955adc Add an artificial line-0 debug location when the compiler emits a call to
__stack_chk_fail(). This avoids a compiler crash.

Differential Revision: http://reviews.llvm.org/D21818

llvm-svn: 274263
2016-06-30 18:49:04 +00:00
Ahmed Bougacha 15a2f6d58c [X86] Lower blended PACKUSes using appropriate types.
When lowering two blended PACKUS, we used to disregard the types
of the PACKUS inputs, indiscriminately generating a v16i8 PACKUS.

This leads to non-selectable things like:
    (v16i8 (PACKUS (v4i32 v0), (v4i32 v1)))

Instead, check that the PACKUSes have the same type, and use that
as the final result type.

llvm-svn: 274138
2016-06-29 16:56:09 +00:00
Rafael Espindola c4cabb8054 Update tests to use at least darwin9.
llvm-svn: 274129
2016-06-29 14:51:10 +00:00
Simon Pilgrim f9c5908ffd [X86][SSE2] Added _mm_loadu_si64 test to match llvm\tools\clang\test\CodeGen\sse2-builtins.c
llvm-svn: 274127
2016-06-29 14:05:33 +00:00
Simon Pilgrim 851019175b [X86] Regenerated popcnt combine tests
llvm-svn: 274124
2016-06-29 13:54:03 +00:00