Craig Topper
283418fbb6
[AVX512] Add patterns for any-extending a mask that use the def of KMOVW/KMOVB without going through an EXTRACT_SUBREG and a MOVZX.
...
llvm-svn: 273253
2016-06-21 07:37:32 +00:00
Igor Breger
e59165ca63
[AVX512] [AVX512/AVX][Intrinsics] Fix Variable Bit Shift Right Arithmetic intrinsic lowering.
...
Differential Revision: http://reviews.llvm.org/D20897
llvm-svn: 273138
2016-06-20 07:05:43 +00:00
Michael Kuperstein
18d6d3d95e
[X86] Add missing AVX512 anyext patterns.
...
Add AVX512 anyext patterns for i16 and i64, modeled on the existing i8 and
i32 patterns.
llvm-svn: 273038
2016-06-17 20:21:17 +00:00
Igor Breger
64cfd3a442
[AVX512] Fix BLENDM lowering patterns. Operands should be swapped to match SELECT behavior.
...
Use BLENDM instead of masked move instruction.
Differential Revision: http://reviews.llvm.org/D21001
llvm-svn: 272763
2016-06-15 07:30:38 +00:00
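To illustrate the operand swap described in the entry above, here is a minimal Python sketch that models both operations on plain lists. The names `select`, `blendm`, and `select_via_blendm` are hypothetical stand-ins, not LLVM or intrinsic APIs. The model assumes the documented merge-masking behavior of the BLENDM instructions, where a set mask bit picks the second source.

```python
def select(mask, if_true, if_false):
    """Model of a vector SELECT: a set mask bit picks from if_true."""
    return [t if m else f for m, t, f in zip(mask, if_true, if_false)]


def blendm(mask, src1, src2):
    """Model of BLENDM: a set mask bit picks from src2, a clear bit from src1."""
    return [b if m else a for m, a, b in zip(mask, src1, src2)]


def select_via_blendm(mask, if_true, if_false):
    """Express SELECT with BLENDM by swapping the two data operands."""
    return blendm(mask, if_false, if_true)
```

Passing the data operands in SELECT order without the swap would pick the wrong source for every lane, which is the kind of mismatch the fix addresses.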
Craig Topper
34d9707825
[AVX512] Use AND32ri8 instead of AND32ri when anding with 1 to create single bit masks. This results in a smaller encoding.
...
llvm-svn: 272627
2016-06-14 03:13:03 +00:00
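The size difference mentioned in the entry above can be sketched by encoding `and r32, imm` both ways. This is a hand-written illustration, not output from LLVM; the helper `encode_and_r32_imm` is hypothetical and only handles register numbers 0 to 7 without a REX prefix. It relies on the standard x86 encodings `83 /4 ib` (sign-extended 8-bit immediate) and `81 /4 id` (32-bit immediate).

```python
import struct


def encode_and_r32_imm(reg, imm, use_imm8):
    """Encode `and r32, imm` for a register number 0-7 (no REX prefix)."""
    # ModRM: mod=11 (register direct), reg field=4 (the /4 opcode extension for AND).
    modrm = 0xC0 | (4 << 3) | reg
    if use_imm8:
        # Opcode 0x83 /4 ib: 8-bit immediate, sign-extended to 32 bits.
        return bytes([0x83, modrm]) + struct.pack("<b", imm)
    # Opcode 0x81 /4 id: full 32-bit immediate.
    return bytes([0x81, modrm]) + struct.pack("<I", imm & 0xFFFFFFFF)


ECX = 1  # register number of ecx in the ModRM r/m field
short_form = encode_and_r32_imm(ECX, 1, use_imm8=True)
long_form = encode_and_r32_imm(ECX, 1, use_imm8=False)
print(short_form.hex(), len(short_form))
print(long_form.hex(), len(long_form))
```

For `and ecx, 1` the 8-bit immediate form takes 3 bytes and the 32-bit immediate form takes 6, so the saving is 3 bytes per mask creation in this model.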
Craig Topper
99e30e6a66
[AVX512] Use MOVZX32 instead of MOVZ16 for loading single v8/v4/v2/v1 masks when KMOVB is not available. This has better behavior with respect to partial register stalls since it won't need to preserve the upper 16-bits of the GPR.
...
llvm-svn: 272626
2016-06-14 03:13:00 +00:00
Craig Topper
ddab395397
[AVX512] Add patterns for zero-extending a mask that use the def of KMOVW/KMOVB without going through an EXTRACT_SUBREG and a MOVZX.
...
llvm-svn: 272625
2016-06-14 03:12:54 +00:00
Simon Pilgrim
b13961d25b
Strip trailing whitespace. NFCI.
...
llvm-svn: 272476
2016-06-11 14:34:10 +00:00
Simon Pilgrim
255fdd0666
[X86][SSE] Use vXi8 return type for PSLLDQ/PSRLDQ instructions
...
These are byte shift instructions, and shuffle combining becomes much more straightforward if we can assume a vXi8 vector of bytes, so that decoded shuffle masks match the return type's number of elements.
llvm-svn: 272468
2016-06-11 12:54:37 +00:00
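As a rough sketch of why a byte vector type is convenient here, the following Python models a 128-bit PSLLDQ as a 16-entry shuffle mask, one entry per result byte. The function names and the use of `None` as a "zero" sentinel are assumptions made for this illustration and do not mirror LLVM's actual shuffle decode helpers.

```python
def pslldq_shuffle_mask(shift_bytes, num_bytes=16):
    """Return a per-byte shuffle mask for a 128-bit byte shift left.

    Entry i is the source byte index for result byte i, or None where a
    zero byte is shifted in.
    """
    return [i - shift_bytes if i >= shift_bytes else None
            for i in range(num_bytes)]


def apply_mask(mask, src):
    """Apply a shuffle mask to a list of bytes, writing 0 for None entries."""
    return [0 if idx is None else src[idx] for idx in mask]
```

With a 16-element result type, the mask length equals the number of vector elements, so the mask can be compared with or composed into other byte shuffles directly.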
Craig Topper
7a2993093e
[X86] Bring consistent naming to the SSE/AVX and AVX512 PALIGNR instructions. Then add shuffle decode printing for the EVEX forms which is made easier by having the naming structure more similar to other instructions.
...
llvm-svn: 272249
2016-06-09 07:06:38 +00:00
Igor Breger
982e4003a6
[AVX512] Fix cvtusi2sd instruction Opcode, it should be 0x7B instead of 0x2A.
...
llvm-svn: 272122
2016-06-08 07:48:23 +00:00
Simon Pilgrim
9a89623b57
[X86][SSE] Add general lowering of nontemporal vector loads
...
Currently the only way to use the (V)MOVNTDQA nontemporal vector load instructions is through the int_x86_sse41_movntdqa style builtins.
This patch adds support for lowering nontemporal loads from general IR, allowing us to remove the movntdqa builtins in a future patch.
We currently still fold nontemporal loads into suitable instructions. We should probably look at removing this (and nontemporal stores as well), or at least make the target's folding implementation aware that it's dealing with a nontemporal memory transaction.
There is also an issue that VMOVNTDQA only acts on 128-bit vectors on pre-AVX2 hardware, so currently a normal ymm load is still used on AVX1 targets.
Differential Revision: http://reviews.llvm.org/D20965
llvm-svn: 272010
2016-06-07 13:34:24 +00:00
Craig Topper
2f90c1fedf
[AVX512] Allow avx2 and sse41 nontemporal load intrinsics to select EVEX encoded instructions when VLX is enabled.
...
llvm-svn: 271988
2016-06-07 07:27:57 +00:00
Craig Topper
e1cac15feb
[AVX512] Remove unnecessary mayLoad, mayStore, hasSideEffects flags from instructions that have patterns that imply them. Add the same set of flags to instructions that don't have patterns to imply them.
...
llvm-svn: 271987
2016-06-07 07:27:54 +00:00
Craig Topper
0fcf925699
[AVX512] Add NoVLX to a couple of patterns that have VLX equivalents. Ordering of the patterns in the .td file protects this, but it's better to be explicit.
...
llvm-svn: 271986
2016-06-07 07:27:51 +00:00
Craig Topper
2388b4610a
[X86] Remove unnecessary pattern predicates from the vector bit cast patterns. The types have to be legal and there are no alternative patterns. Saves almost 200 bytes in isel table.
...
llvm-svn: 271625
2016-06-03 04:15:27 +00:00
Craig Topper
19462f02bb
[X86] Cleanup formatting a bit to align similar parts of adjacent lines.
...
llvm-svn: 271624
2016-06-03 04:15:25 +00:00
Craig Topper
895897f85b
[X86] Remove redundant bitcast patterns for 128/256-bit vectors. These only differ from the SSE/AVX versions by the register class, but register class has no bearing on isel.
...
llvm-svn: 271623
2016-06-03 04:15:22 +00:00
Igor Breger
73ee8ba9b0
[AVX512] Fix intrinsic vcvtps2ph lowering.
...
Differential Revision: http://reviews.llvm.org/D20788
llvm-svn: 271255
2016-05-31 08:04:21 +00:00
Igor Breger
52bd1d5fcc
Fix intrinsic vbroadcast{i32|f32}x2 lowering.
...
Differential Revision: http://reviews.llvm.org/D20780
llvm-svn: 271254
2016-05-31 07:43:39 +00:00
Craig Topper
95bdabd338
[AVX512] Add patterns to implement stores of extracts of least significant subvectors using XMM or YMM stores instead of the vector extract instructions.
...
Something similar is already done for AVX, and we had lost it when going to AVX512VL.
llvm-svn: 270383
2016-05-22 23:44:33 +00:00
Igor Breger
2ba64ab9ae
[AVX512] Implement missing patterns for any_extend load lowering.
...
Differential Revision: http://reviews.llvm.org/D20513
llvm-svn: 270357
2016-05-22 10:21:04 +00:00
Craig Topper
5f3fef884f
[AVX512] The AVX512 file only needs extract_subvector index 0 patterns where the source is 512-bits. The 256-bit source patterns were redundant with AVX.
...
llvm-svn: 270356
2016-05-22 07:40:58 +00:00
Craig Topper
a1041ff001
[AVX512] Add an AddedComplexity line to the 512-bit insert_subvector undef index 0 patterns. This gives them higher priority than the memory patterns. This matches AVX1/2.
...
llvm-svn: 270355
2016-05-22 07:40:40 +00:00
Craig Topper
de5498546e
[AVX512] Change the AddedComplexity on some patterns to match their AVX/SSE equivalents. This helps group them close together in the isel tables and enables table compression.
...
llvm-svn: 270354
2016-05-22 06:09:34 +00:00
Craig Topper
33c550cb95
[AVX512] Add a couple patterns to fix some cases where two vector mask inversions could appear in a row.
...
llvm-svn: 270344
2016-05-22 00:39:30 +00:00
Craig Topper
dbac1ff9c1
[AVX512] Remove seemingly unnecessary AddedComplexity adjustment.
...
llvm-svn: 270343
2016-05-22 00:39:27 +00:00
Craig Topper
db960eddfa
[AVX512] Add patterns for extracting subvectors and storing to memory.
...
llvm-svn: 270334
2016-05-21 22:50:14 +00:00
Craig Topper
03b849eb44
[AVX512] Capitalize the Z in VEXTRACTPSzmr. Lowercase z has been primarily used to indicate the zero masking behavior, which is not the case here. NFC
...
llvm-svn: 270333
2016-05-21 22:50:11 +00:00
Craig Topper
d5da6a39f2
[AVX512] Rename vector extract instructions to use 'mr' instead of 'rm' to reflect the fact that memory is the destination.
...
llvm-svn: 270332
2016-05-21 22:50:09 +00:00
Craig Topper
08a6857c82
[AVX512] Fix a copy/paste mistake I made in a comment.
...
llvm-svn: 270331
2016-05-21 22:50:04 +00:00
Michael Zuckerman
11b55b29d1
[Clang][AVX512][intrinsics] Fix vscalef intrinsics.
...
Differential Revision: http://reviews.llvm.org/D20324
llvm-svn: 270321
2016-05-21 11:09:53 +00:00
Craig Topper
02626c076b
[AVX512] Add patterns for VEXTRACT v16i16->v8i16 and v32i8->v16i8. Disable AVX2 versions of vector extract when AVX512VL is enabled.
...
llvm-svn: 270318
2016-05-21 07:08:56 +00:00
Craig Topper
19e04b6430
[X86] Generalize and combine some similar type constraints and node types. No changes to the isel table size so the separation wasn't buying us anything.
...
llvm-svn: 270026
2016-05-19 06:13:58 +00:00
Craig Topper
74ed087b0b
[AVX512] Strengthen type checks on the X86ISD::SELECT node. Saves over 800 bytes in the DAG isel table by removing type checks for the condition operand, which is always a vector or scalar of i1 matching the number of elements in the other operands.
...
llvm-svn: 269885
2016-05-18 06:55:59 +00:00
Craig Topper
a58abd1cc6
[AVX512] Fix up types for arguments of int_x86_avx512_mask_cvtsd2ss_round and int_x86_avx512_mask_cvtss2sd_round. Only the argument being converted should be a different type. The other 2 arguments should have the same type as the result.
...
llvm-svn: 268891
2016-05-09 05:34:12 +00:00
Craig Topper
707c89c00d
[AVX512] Add non-temporal store patterns for v16i32/v32i16/v64i8.
...
llvm-svn: 268889
2016-05-08 23:43:17 +00:00
Craig Topper
c41320d700
[AVX512] Add missing patterns for non-temporal stores of 128/256-bit vXi8/vXi16/vXi32 when VLX is enabled. The equivalent AVX1/2 patterns are disabled by VLX.
...
This caused regular stores to be emitted instead.
llvm-svn: 268886
2016-05-08 23:08:45 +00:00
Craig Topper
e5ce84a33c
[AVX512] Add VLX 128/256-bit SET0 operations that encode to 128/256-bit EVEX encoded VPXORD so all 32 registers can be used.
...
llvm-svn: 268884
2016-05-08 21:33:53 +00:00
Craig Topper
9d9251b86f
[X86] Remove extra patterns that check for BUILD_VECTOR of all 0s. These are always canonicalized to v4i32/v8i32/v16i32, except in SSE1-only mode, where only v4f32 is supported.
...
llvm-svn: 268880
2016-05-08 20:10:20 +00:00
Igor Breger
58c07806ae
[AVX512] Add support for commutative MAX/MIN. In general, the VMAX{PS,PD} and VMIN{PS,PD} instructions are not commutative. Only when UnsafeFPMath is used does the combine pass convert VMAX/VMIN to the commutative nodes VMAXC/VMINC.
...
Differential Revision: http://reviews.llvm.org/D19860
llvm-svn: 268375
2016-05-03 11:51:45 +00:00
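The reason these instructions are not commutative in general can be sketched with a one-lane Python model. The function `vmax_lane` is a hypothetical name for this illustration. It assumes the documented x86 behavior that the maximum instructions return the second source operand whenever the comparison "first > second" is false, which covers NaN inputs and the case of two zeros with different signs.

```python
import math


def vmax_lane(src1, src2):
    """One lane of an x86-style max: returns src2 unless src1 > src2."""
    return src1 if src1 > src2 else src2


nan = float("nan")
print(vmax_lane(nan, 1.0), vmax_lane(1.0, nan))
print(vmax_lane(0.0, -0.0), vmax_lane(-0.0, 0.0))
```

Swapping the operands changes the result when one input is NaN or when the inputs are +0.0 and -0.0. Treating the operation as commutative is therefore only acceptable when such differences are allowed to be ignored, as under unsafe FP math.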
Craig Topper
b6da65403a
[AVX512] VPACKUSWB/VPACKSSWB should not be encoded with EVEX.W=1. While there fix the execution domain for VPACKSSDW/VPACKUSDW.
...
llvm-svn: 268200
2016-05-01 17:38:32 +00:00
Igor Breger
131008fbcb
Change the interaction of AVX512 broadcastsd/ss patterns with spilling. The new implementation takes a scalar register and generates a vector without COPY_TO_REGCLASS (which turns it into a VR128 register). The issue is that during register allocation we may spill a scalar value using 128-bit loads and stores, wasting cache bandwidth.
...
Differential Revision: http://reviews.llvm.org/D19579
llvm-svn: 268190
2016-05-01 08:40:00 +00:00
Craig Topper
5acb5a1caf
[AVX512] Add HasVLX to the 128/256-bit versions of VPACKSSDW/USDW/SSWB/USWB and VPMADDUBSW/VPMADDWD.
...
llvm-svn: 268188
2016-05-01 06:24:57 +00:00
Craig Topper
db290664f6
[AVX512] Make sure 128/256-bit DQI versions of VAND/VANDN/VOR/VXOR are also marked as requiring VLX.
...
llvm-svn: 268186
2016-05-01 05:57:06 +00:00
Craig Topper
7ed84d826e
[X86] Remove some redundant selection patterns.
...
llvm-svn: 268180
2016-05-01 04:59:46 +00:00
Craig Topper
c9b1923358
[AVX512] Replace vector_extract with extractelt in some patterns. They mean the same thing but vector_extract is deprecated. NFC
...
llvm-svn: 268179
2016-05-01 04:59:44 +00:00
Craig Topper
99f6b620cc
[AVX512] Add hasSideEffects/mayLoad/mayStore flags to some instructions.
...
llvm-svn: 268174
2016-05-01 01:03:56 +00:00
Elena Demikhovsky
5e426f7356
AVX-512: Load and Extended Load for i1 vectors
...
Implemented load+{sign|zero}_extend for i1 vectors
Fixed failures in i1 vector load.
Covered loading of v2i1, v4i1, v8i1, v16i1, v32i1, v64i1 vectors for KNL and SKX.
Differential Revision: http://reviews.llvm.org/D18737
llvm-svn: 265259
2016-04-03 08:41:12 +00:00
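As a loose illustration of what load+{sign|zero}_extend means for an i1 vector, the sketch below expands packed mask bits into one integer per element. The helper `load_i1_vector` is hypothetical and models only the values produced, not the instructions chosen on KNL or SKX. It assumes bit i of the packed value corresponds to element i.

```python
def load_i1_vector(mask_bits, num_elts, signed):
    """Expand the low num_elts bits of an integer into extended lanes.

    Zero extension yields 0 or 1 per lane; sign extension yields 0 or -1
    (all bits set) per lane.
    """
    lanes = []
    for i in range(num_elts):
        bit = (mask_bits >> i) & 1
        lanes.append(-bit if signed else bit)
    return lanes
```

For narrow types such as v2i1 or v4i1, only the low bits of the loaded byte are meaningful in this model, and any higher bits are ignored.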
Elena Demikhovsky
95629caaa9
AVX-512: fixed a bug in fp_to_uint pattern on KNL
...
Fixed fp_to_uint instruction selection on KNL.
One pattern was missing for <4 x double> to <4 x i32>
Differential Revision: http://reviews.llvm.org/D18512
llvm-svn: 264701
2016-03-29 06:33:41 +00:00