Commit Graph

446 Commits

Author SHA1 Message Date
Asaf Badouh d2c3599c5f [X86][AVX512VLBW] add support in byte shift and SAD
add byte shift left/right
add SAD - compute sum of absolute differences

Differential Revision: http://reviews.llvm.org/D12479

llvm-svn: 246654
2015-09-02 14:21:54 +00:00
Igor Breger 1e58e8adf6 AVX512: Implemented encoding and intrinsics for VGETMANTPD/S , VGETMANTSD/S instructions
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11593

llvm-svn: 246642
2015-09-02 11:18:55 +00:00
Igor Breger a6297c701e AVX512: Implemented encoding and intrinsics for vshufps/d.
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11709

llvm-svn: 246640
2015-09-02 10:50:58 +00:00
Elena Demikhovsky 9f83c7346f AVX-512: store <4 x i1> and <2 x i1> values in memory
Enabled DAG pattern lowering for SKX with DQI predicate.

Differential Revision: http://reviews.llvm.org/D12550

llvm-svn: 246625
2015-09-02 09:20:58 +00:00
Igor Breger 5ea0a68115 AVX512: ktest implemantation
Added tests for encoding.

Differential Revision: http://reviews.llvm.org/D11979

llvm-svn: 246439
2015-08-31 13:30:19 +00:00
Igor Breger f3ded811b2 AVX512: Implemented encoding and intrinsics for vdbpsadbw
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D12491

llvm-svn: 246436
2015-08-31 13:09:30 +00:00
Igor Breger 59ac339357 AVX512: kadd implementation
Added tests for encoding.

Differential Revision: http://reviews.llvm.org/D11973

llvm-svn: 246432
2015-08-31 11:50:23 +00:00
Igor Breger 2ae0fe3ac3 AVX512: Implemented encoding and intrinsics for vpalignr
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D12270

llvm-svn: 246428
2015-08-31 11:14:02 +00:00
Marina Yatsina 7a4e1ba737 [X86] Fix bug in COMISD and COMISS definition in td files
COMISD should receive QWORD because it is defined as
 (V)COMISD xmm1, xmm2/m64

COMISS should receive DWORD because it is defined as
 (V)COMISS xmm1, xmm2/m32

Differential Revision: http://reviews.llvm.org/D11712

llvm-svn: 245551
2015-08-20 11:21:36 +00:00
Michael Liao 66233b7d79 Removing tailing whitespaces
llvm-svn: 244203
2015-08-06 09:06:20 +00:00
Igor Breger 8352a0ddf2 AVX512: Implemented encoding and intrinsics for VGETEXPSS/D instructions
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11528

llvm-svn: 243390
2015-07-28 06:53:28 +00:00
Igor Breger f2460112ad Implemented encoding and intrinsics of the following instructions
vunpckhps/pd, vunpcklps/pd, 
  vpunpcklbw, vpunpckhbw, vpunpcklwd, vpunpckhwd, vpunpckldq, vpunpckhdq, vpunpcklqdq, vpunpckhqdq
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11509

llvm-svn: 243246
2015-07-26 14:41:44 +00:00
Igor Breger 074a64e72c AVX-512: Implemented encoding , DAG lowering and intrinsics for Integer Truncate with/without saturation
Added tests for DAG lowering ,encoding and intrinsic

Differential Revision: http://reviews.llvm.org/D11218

llvm-svn: 243122
2015-07-24 17:24:15 +00:00
Chandler Carruth fe414353db Revert r242990: "AVX-512: Implemented encoding , DAG lowering and ..."
This commit broke the build. Numerous build bots broken, and it was
blocking my progress so reverting.

It should be trivial to reproduce -- enable the BPF backend and it
should fail when running llvm-tblgen.

llvm-svn: 242992
2015-07-23 08:03:44 +00:00
Igor Breger da1b2ea955 AVX-512: Implemented encoding , DAG lowering and intrinsics for Integer Truncate with/without saturation
Added tests for DAG lowering ,encoding and intrinsic

Differential Revision: http://reviews.llvm.org/D11218

llvm-svn: 242990
2015-07-23 07:39:21 +00:00
Asaf Badouh a5b2e5e2a7 [X86][AVX512] add reduce/range/scalef/rndScale
include encoding and intrinsics

Differential Revision: http://reviews.llvm.org/D11222

llvm-svn: 242896
2015-07-22 12:00:43 +00:00
Igor Breger f7fd547e27 AVX512 : Implemented VPMADDUBSW and VPMADDWD instruction ,
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11351

llvm-svn: 242761
2015-07-21 07:11:28 +00:00
Elena Demikhovsky 0f370936a0 AVX-512: Added all AVX-512 forms of Vector Convert for Float/Double/Int/Long types.
In this patch I have only encoding. Intrinsics and DAG lowering will be in the next patch.
I temporary removed the old intrinsics test (just to split this patch).
Half types are not covered here.

Differential Revision: http://reviews.llvm.org/D11134

llvm-svn: 242023
2015-07-13 13:26:20 +00:00
Simon Pilgrim 8b756596fc [X86][SSE] Use the general SMAX/SMIN/UMAX/UMIN opcodes and remove the X86 implementation
With the completion of D9746 there is now a common implementation of integer signed/unsigned min/max nodes, removing the need for the equivalent X86 specific implementations.

This patch removes the old X86ISD nodes, legalizes the relevant SSE2/SSE41/AVX2/AVX512 instructions for the ISD versions and converts the small amount of existing X86 code.

Differential Revision: http://reviews.llvm.org/D10947

llvm-svn: 241506
2015-07-06 20:30:47 +00:00
Asaf Badouh c6f3c82ffc [X86][AVX512] Multiply Packed Unsigned Integers with Round and Scale
pmulhrsw

review:
http://reviews.llvm.org/D10948

llvm-svn: 241443
2015-07-06 14:03:40 +00:00
Asaf Badouh 73f26f8ffc [x86][AVX512] add Multiply High Op
include encoding and intrinsics tests.

review
http://reviews.llvm.org/D10896

llvm-svn: 241406
2015-07-05 12:23:20 +00:00
Igor Breger 15820b072b AVX-512: Implemented missing encoding for FMA scalar instructions
Added tests for encoding

Differential Revision: http://reviews.llvm.org/D10865

llvm-svn: 241159
2015-07-01 13:24:28 +00:00
Elena Demikhovsky 30bc4ca313 AVX-512: all forms of SCATTER instruction on SKX,
encoding, intrinsics and tests.

llvm-svn: 240936
2015-06-29 12:14:24 +00:00
Igor Breger a7a8e9a018 AVX-512: Implemented missing encoding and intrinsics for FMA instructions
Added tests for DAG lowering ,encoding and intrinsics

Differential Revision: http://reviews.llvm.org/D10796

llvm-svn: 240926
2015-06-29 09:10:00 +00:00
Asaf Badouh 7ec4b7a8bb [x86][AVX512]
Add vscalef support
include encoding and intrinsics


review:
http://reviews.llvm.org/D10730

llvm-svn: 240906
2015-06-28 14:30:39 +00:00
Elena Demikhovsky 6a1a357f1f AVX-512: Added all SKX forms of GATHER instructions.
Added intrinsics.
Added encoding and tests.

llvm-svn: 240905
2015-06-28 10:53:29 +00:00
Elena Demikhovsky 5e2f8c4231 AVX-512: Added all forms of VPABS instruction
Added all intrinsics, tests for encoding, tests for intrinsics.

llvm-svn: 240386
2015-06-23 08:19:46 +00:00
Elena Demikhovsky 55a997437c AVX-512: added VPSHUFB instruction - all SKX forms
Added intrinsics and encoding tests.

llvm-svn: 240277
2015-06-22 13:00:42 +00:00
Elena Demikhovsky ba5ab328e5 AVX-512: All forms of VCOPMRESS VEXPAND instructions,
encoding tests.

llvm-svn: 240272
2015-06-22 11:16:30 +00:00
Asaf Badouh 81f03c30a5 [AVX512]
add instructions: VPAVGB and VPAVGW


review
http://reviews.llvm.org/D10504

llvm-svn: 240012
2015-06-18 12:30:53 +00:00
Elena Demikhovsky d3057e5e37 AVX-512: (fixed) Added encoding of all forms of VPERMT2W/D/Q/PS/PD and VPERMI2W/D/Q/PS/PD.
Intrinsics and tests for them are comming in the next patch.

llvm-svn: 240003
2015-06-18 08:56:19 +00:00
Elena Demikhovsky 4f13f3f9b8 reverted 239999 due to test failures
llvm-svn: 240001
2015-06-18 08:06:49 +00:00
Elena Demikhovsky 975a637cd9 AVX-512: Added encoding of all forms of VPERMT2W/D/Q/PS/PD
and VPERMI2W/D/Q/PS/PD.
Intrinsics and tests for them are comming in the next patch.

llvm-svn: 239999
2015-06-18 07:29:40 +00:00
Igor Breger dfcc3d31a7 AVX-512: cvtusi2ss/d intrinsics.
Change builtin function name and signature ( add third parameter - rounding mode ).
Added tests for intrinsics.

Differential Revision: http://reviews.llvm.org/D10473

llvm-svn: 239888
2015-06-17 07:23:57 +00:00
Asaf Badouh 02d126cb9d [AVX512] add integer min/max intrinsics support.
review:
http://reviews.llvm.org/D10439

llvm-svn: 239806
2015-06-16 08:39:27 +00:00
Igor Breger abe4a79b75 AVX-512: Implemented cvtsi2ss/d cvtusi2ss/d instructions with round control for KNL.
Added intrinsics for cvtsi2ss/d instructions.
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D10430

llvm-svn: 239694
2015-06-14 12:44:55 +00:00
Elena Demikhovsky 00c9ad5ec2 AVX-512: Fixed a bug in comparison of i1 vectors.
cmp eq should give kxnor instruction
cmp neq should give kxor 

https://llvm.org/bugs/show_bug.cgi?id=23631

llvm-svn: 239460
2015-06-10 06:49:28 +00:00
Igor Breger 00d9f8457b AVX-512: Implemented 256/128bit VALIGND/Q instructions for SKX and KNL
Implemented DAG lowering for all these forms.
Added tests for DAG lowering and encoding.

Differential Revision: http://reviews.llvm.org/D10310

llvm-svn: 239300
2015-06-08 14:03:17 +00:00
Elena Demikhovsky 4078c75bd4 AVX-512: added all SKX forms of VPERMW/D/Q instructions.
Added all forms of VPERMPS/PD instrcuctions.
Added encoding tests.

llvm-svn: 239016
2015-06-04 07:07:13 +00:00
Asaf Badouh 402ebb34af re-apply 238809
AVX-512: Implemented GETEXP instruction for KNL and SKX
Added rounding mode modifier for SQRTPS/PD
Added tests for encoding and intrinsics.
CR:
http://reviews.llvm.org/D9991

llvm-svn: 238923
2015-06-03 13:41:48 +00:00
Elena Demikhovsky 9e38086534 AVX-512: Implemented SHUFF32x4/SHUFF64x2/SHUFI32x4/SHUFI64x2 instructions for SKX and KNL.
Added tests for encoding.

By Igor Breger (igor.breger@intel.com)

llvm-svn: 238917
2015-06-03 10:56:40 +00:00
Elena Demikhovsky 8938f5acca AVX-512: Implemented VRANGESD and VRANGESS instructions for SKX Implemented DAG lowering for all these forms.
Added tests for encoding.

By Igor Breger (igor.breger@intel.com)

llvm-svn: 238834
2015-06-02 14:12:54 +00:00
Elena Demikhovsky 3425c932da AVX-512: Implemented VFIXUPIMMSD and VFIXUPIMMSS instructions for KNL
Implemented DAG lowering for all these forms.
Added tests for encoding.

By Igor Breger (igor.breger@intel.com)

llvm-svn: 238811
2015-06-02 08:28:57 +00:00
Asaf Badouh 8d897dd05f revert 238809
llvm-svn: 238810
2015-06-02 07:45:19 +00:00
Asaf Badouh 17de10f37e AVX-512: Implemented GETEXP instruction for KNL and SKX
Added rounding mode modifier for SQRTPS/PD
Added tests for encoding and intrinsics.

llvm-svn: 238809
2015-06-02 07:18:14 +00:00
Elena Demikhovsky 3582eb3b39 AVX-512: Implemented VRANGEPD and VRANGEPD instructions for SKX.
Implemented DAG lowering for all these forms.
Added tests for encoding.

By Igor Breger (igor.breger@intel.com)

llvm-svn: 238738
2015-06-01 11:05:34 +00:00
Elena Demikhovsky 75ede68793 AVX-512: added all forms of VPSHUFD and VPSHUFHW, VPSHUFLW
including encodings.

llvm-svn: 238729
2015-06-01 07:17:23 +00:00
Elena Demikhovsky 42c96d9c0a AVX-512: Implemented VFIXUPIMMPD and VFIXUPIMMPS instructions for KNL and SKX
Implemented DAG lowering for all these forms.
Added tests for encoding.

by Igor Breger (igor.breger@intel.com)

llvm-svn: 238728
2015-06-01 06:50:49 +00:00
Elena Demikhovsky 86c7b46680 AVX-512: Fixed a bug in extracting subvector from v64i1
By Igor Breger (igor.breger@intel.com)

llvm-svn: 238322
2015-05-27 14:09:33 +00:00
Elena Demikhovsky 3948c590e3 AVX-512: Implemented all forms of sign-extend and zero-extend instructions for KNL and SKX
Implemented DAG lowering for all these forms.
Added tests for DAG lowering and encoding.

By Igor Breger (igor.breger@intel.com)

llvm-svn: 238301
2015-05-27 08:15:19 +00:00
Elena Demikhovsky f61727d880 AVX-512: fixed algorithm of building vectors of i1 elements
fixed extract-insert i1 element,
load i1, zextload i1 should be with "and $1, %reg" to prevent loading garbage.
added a bunch of new tests.

llvm-svn: 237793
2015-05-20 14:32:03 +00:00
Elena Demikhovsky 08ce53c0ea AVX-512: Added patterns for scalar-to-vector broadcast
llvm-svn: 237558
2015-05-18 07:06:23 +00:00
Elena Demikhovsky ad9c396838 AVX-512: Added VBROADCASTF64X4, VBROADCASTF64X2, VBROADCASTI32X8, and other instructions from this set
Added encoding tests.

llvm-svn: 237557
2015-05-18 06:42:57 +00:00
Elena Demikhovsky 1d6a495d6d AVX-512: fixed a bug in mask operations - (i1 1) pattern
Filling k-reg with all-ones value was wrong,
(i1 1) should switch on only one bit in mask register

llvm-svn: 237536
2015-05-17 07:28:51 +00:00
Elena Demikhovsky 1b2f2f1b37 AVX-512: fixed a bug in encoding of VPSRAQ instrcution,
added a bunch of encoding tests.

llvm-svn: 237232
2015-05-13 07:35:05 +00:00
Elena Demikhovsky 0d7e9364d1 AVX-512: Added SKX instructions and intrinsics:
{add/sub/mul/div/} x {ps/pd} x {128/256} 2. max/min with sae

By Asaf Badouh (asaf.badouh@intel.com)

llvm-svn: 236971
2015-05-11 06:05:05 +00:00
Elena Demikhovsky 75d1489326 AVX-512: fixed a bug in i1 vectors lowering
llvm-svn: 236947
2015-05-10 10:33:32 +00:00
Elena Demikhovsky 29792e9a80 AVX-512: Added all forms of FP compare instructions for KNL and SKX.
Added intrinsics for the instructions. CC parameter of the intrinsics was changed from i8 to i32 according to the spec.

By Igor Breger (igor.breger@intel.com)

llvm-svn: 236714
2015-05-07 11:24:42 +00:00
Elena Demikhovsky 60eb9db7bb AVX-512: added calling convention for i1 vectors in 32-bit mode.
Fixed some bugs in extend/truncate for AVX-512 target.
Removed VBROADCASTM (masked broadcast) node, since it is not used any more.

llvm-svn: 236420
2015-05-04 12:40:50 +00:00
Elena Demikhovsky 52266388f8 AVX-512: added integer "add" and "sub" instructions with saturation for SKX
with intrinsics and tests

by Asaf Badouh (asaf.badouh@intel.com)

llvm-svn: 236418
2015-05-04 12:35:55 +00:00
Elena Demikhovsky 2557a22be7 AVX-512: Added VPACK* instructions forms for KNL and SKX
and their intrinsics
by Asaf Badouh (asaf.badouh@intel.com)

llvm-svn: 236414
2015-05-04 09:14:02 +00:00
Elena Demikhovsky e1eda8a9e6 Masked gather and scatter - added DAGCombine visitors
and AVX-512 instruction selection patterns.
All other patches, including tests will follow.

http://reviews.llvm.org/D7665

llvm-svn: 236211
2015-04-30 08:38:48 +00:00
Elena Demikhovsky d1084c5b3f AVX-512: Extend/Truncate operations for SKX,
SETCC for bit-vectors

llvm-svn: 235875
2015-04-27 12:57:59 +00:00
Elena Demikhovsky 0e6d6d54ce AVX-512: Added VPMOVx2M instructions for SKX,
fixed encoding of VPMOVM2x.

llvm-svn: 235385
2015-04-21 14:38:31 +00:00
Elena Demikhovsky 431b81e41f AVX-512: Added VPTESTM and VPTESTNM instructions for SKX
llvm-svn: 235383
2015-04-21 13:13:46 +00:00
Elena Demikhovsky 50b88ddb87 AVX-512: Added logical and arithmetic instructions for SKX
by Asaf Badouh (asaf.badouh@intel.com)

llvm-svn: 235375
2015-04-21 10:27:40 +00:00
Elena Demikhovsky 1eeece1285 AVX-512: intrinsics for VPADD, VPMULDQ and VPSUB
by Asaf Badouh (asaf.badouh@intel.com)

llvm-svn: 233906
2015-04-02 10:51:40 +00:00
Elena Demikhovsky d8fda62247 AVX-512: blank lines, duplicated tests, no functional changes
see comments http://reviews.llvm.org/D6835

llvm-svn: 233528
2015-03-30 09:29:28 +00:00
Elena Demikhovsky 72e3ccc375 AVX-512: Fixed the "commutative" property flag in VPANDN instruction
By Asaf Badouh (asaf.badouh@intel.com)

llvm-svn: 233489
2015-03-29 09:14:29 +00:00
Elena Demikhovsky 5d06b4c80c AVX-512: Added encoding tests for VPROR, VPROL instructions,
fixed opcode.

llvm-svn: 232018
2015-03-12 07:28:41 +00:00
Elena Demikhovsky 0b9dbe33aa AVX-512: Added SKX forms of shift instructions.
Added rotation instructions, encoding only.
Added encoding tests for all these forms.

llvm-svn: 231916
2015-03-11 10:25:42 +00:00
Elena Demikhovsky de05f10de2 AVX-512, SKX: Enabled masked_load/store operations for this target.
Added lowering for ISD::CONCAT_VECTORS and ISD::INSERT_SUBVECTOR for i1 vectors,
it is needed to pass all masked_memop.ll tests for SKX.

llvm-svn: 231371
2015-03-05 15:11:35 +00:00
Elena Demikhovsky d207f17fa1 AVX-512: Moved patterns for masked load/store under avx_store, avx_load classes.
No functional changes.

llvm-svn: 231069
2015-03-03 15:03:35 +00:00
Elena Demikhovsky 2689d78909 AVX-512: Simplified MOV patterns, no functional changes.
llvm-svn: 230954
2015-03-02 12:46:21 +00:00
Craig Topper 09b27e7b24 [X86] Fix diassembler crash on AVX512 cmpps/cmppd with immediate that doesn't fit in 5-bits. Fixes PR22743.
llvm-svn: 230924
2015-03-02 00:22:29 +00:00
Elena Demikhovsky 0995479e67 Reverted 230471 - gather scatter handling in table gen.
llvm-svn: 230892
2015-03-01 08:23:41 +00:00
Elena Demikhovsky 02ffd26023 AVX-512: Added mask and rounding mode for scalar arithmetics
Added more tests for scalar instructions to destinguish between AVX and AVX-512 forms.

llvm-svn: 230891
2015-03-01 07:44:04 +00:00
Elena Demikhovsky 56eadcf5ce AVX-512: Gather and Scatter patterns
Gather and scatter instructions additionally write to one of the source operands - mask register.
In this case Gather has 2 destination values - the loaded value and the mask.
Till now we did not support code gen pattern for gather - the instruction was generated from 
intrinsic only and machine node was hardcoded.
When we introduce the masked_gather node, we need to select instruction automatically,
in the standard way.
I added a flag "hasTwoExplicitDefs" that allows to handle 2 destination operands.

(Some code in the X86InstrFragmentsSIMD.td is commented out, just to split one big
patch in many small patches)

llvm-svn: 230471
2015-02-25 09:46:31 +00:00
Elena Demikhovsky 52e81bc499 AVX-512: recommitted 229837 + bugfix + test
llvm-svn: 230223
2015-02-23 15:12:31 +00:00
Eric Christopher 0d94fa98e5 Revert "AVX-512: Full implementation for VRNDSCALESS/SD instructions and intrinsics."
The instructions were being generated on architectures that don't support avx512.

This reverts commit r229837.

llvm-svn: 229942
2015-02-20 00:45:28 +00:00
Eric Christopher 06b32cdfed Add a license header to the AVX512 file.
llvm-svn: 229941
2015-02-20 00:36:53 +00:00
Elena Demikhovsky 69e8b45b13 AVX-512: Full implementation for VRNDSCALESS/SD instructions and intrinsics.
llvm-svn: 229837
2015-02-19 10:48:04 +00:00
Elena Demikhovsky 714f23bcdb AVX-512: Added support for FP instructions with embedded rounding mode.
By Asaf Badouh <asaf.badouh@intel.com>

llvm-svn: 229645
2015-02-18 07:59:20 +00:00
Elena Demikhovsky ba84672519 AVX-512: changes in intel_ocl_bi calling conventions
- added mask types v8i1 and v16i1 to possible function parameters
- enabled passing 512-bit vectors in standard CC
- added a test for KNL intel_ocl_bi conventions

llvm-svn: 229482
2015-02-17 09:20:12 +00:00
Elena Demikhovsky d2cb3c8876 AVX-512: Fixed the "test" operation for i1 type
Using KORTESTW for comparison i1 value with zero was wrong since the instruction tests 16 bits.
KORTESTW may be used with KSHIFTL+KSHIFTR that clean the 15 upper bits.
I removed (X86cmp i1, 0) pattern and zero-extend i1 to i8 and then use TESTB.

There are some cases where i1 is in the mask register and the upper bits are already zeroed.
Then KORTESTW is the better solution, but it is subject for optimization.
Meanwhile, I'm fixing the correctness issue.

llvm-svn: 228916
2015-02-12 08:40:34 +00:00
Craig Topper 820d49270d [X86] Remove 'memop' uses from AVX512. Use 'load' instead.
llvm-svn: 228562
2015-02-09 04:04:50 +00:00
Elena Demikhovsky 7b0dd39db6 AVX-512: Added FMA intrinsics with rounding mode
By Asaf Badouh and Elena Demikhovsky

Added special nodes for rounding: FMADD_RND, FMSUB_RND..
It will prevent merge between nodes with rounding and other standard nodes.

llvm-svn: 227303
2015-01-28 10:21:27 +00:00
Craig Topper 7d3c6d307a [X86] Teach disassembler to handle illegal immediates on AVX512 integer compare instructions.
llvm-svn: 227302
2015-01-28 10:09:56 +00:00
Elena Demikhovsky 1a603b3f13 AVX-512: Changes in operations on masks registers for KNL and SKX
- Added KSHIFTB/D/Q for skx
- Added KORTESTB/D/Q for skx
- Fixed store operation for v8i1 type for KNL
- Store size of v8i1, v4i1 and v2i1 are changed to 8 bits

llvm-svn: 227043
2015-01-25 12:47:15 +00:00
Craig Topper ca8e179bc2 [X86] Give scalar VRNDSCALE instructions priority in AVX512 mode.
llvm-svn: 227039
2015-01-25 08:49:22 +00:00
Craig Topper 1d60952401 Simplify a multiclass. No functional change.
llvm-svn: 227038
2015-01-25 08:49:19 +00:00
Craig Topper 53a846764c [X86] Replace i32i8imm on SSE/AVX instructions with i32u8imm which will make the assembler bounds check them. It will also make them print as unsigned.
llvm-svn: 227032
2015-01-25 02:21:16 +00:00
Craig Topper fc946a0e6f [X86] Use u8imm in several places that used i32i8imm that don't require an i32 type.
llvm-svn: 227031
2015-01-25 02:21:13 +00:00
Craig Topper 46469aa4da [X86] Add IntrNoMem to the AVX512 conflict intrinsics.
llvm-svn: 226897
2015-01-23 06:11:45 +00:00
Craig Topper e0c8e8f6a7 Revert r226798. Guess I missed the patterns.
llvm-svn: 226802
2015-01-22 09:01:20 +00:00
Craig Topper ffef4cf1e1 Use u8imm instead of i32i8imm on a couple instructions that have no patterns and thus no reason to use a larger operand size.
llvm-svn: 226798
2015-01-22 08:53:11 +00:00
Craig Topper 9b39e54001 [X86] Remove some unused multiclasses from AVX512 instruction file.
llvm-svn: 226797
2015-01-22 08:53:08 +00:00
Craig Topper 7ff6ab30a9 [X86] Convert all the i8imm used by AVX512 and MMX instructions to u8imm.
llvm-svn: 226646
2015-01-21 08:43:49 +00:00
Craig Topper f38dea1cfa [x86] Add assembly parser bounds checking to the immediate value for cmpss/cmpsd/cmpps/cmppd.
llvm-svn: 226642
2015-01-21 06:07:53 +00:00
Craig Topper 9f4d485610 [x86] Add some mayLoad/hasSideEffects flags. Remove one that was already covered by a pattern.
llvm-svn: 226562
2015-01-20 12:15:30 +00:00
Craig Topper f4bf9119a1 [x86] Change AVX512 intrinsics to take a 8-bit immediate for the comparision kind instead of a 32-bit immediate. This better aligns with the emitted instruction. It also matches SSE and AVX1 equivalents. Also add auto upgrade support.
llvm-svn: 226430
2015-01-19 06:07:27 +00:00
Adam Nemet 3e8b22bc1b [AVX512] Add intrinsics for masked aligned FP loads and stores
Similar to the unaligned cases.

Test was generated with update_llc_test_checks.py.

Part of <rdar://problem/17688758>

llvm-svn: 226296
2015-01-16 18:50:09 +00:00
Craig Topper 9fdd078afb Hide some redundant AVX512 instructions from the asm parser, but force them to show up in the disassembler.
llvm-svn: 226155
2015-01-15 09:37:15 +00:00
Craig Topper 6e3a582809 [x86] Prevent instruction selection of AVX512 cmp.ps/pd/ss/sd intrinsics with illegal immediates. Correctly this time. I did the wrong patterns the first time.
llvm-svn: 224891
2014-12-27 20:08:45 +00:00
Craig Topper 1113fb343e [x86] Prevent instruction selection of AVX512 cmp.ps/pd/ss/sd intrinsics with illegal immediates. Forgot to do this when I did SSE/SSE2/AVX/AVX2.
llvm-svn: 224887
2014-12-27 18:51:06 +00:00
Elena Demikhovsky fcea06acb5 AVX-512: Added FMA instructions, intrinsics an tests for KNL and SKX targets
by Asaf Badouh

http://reviews.llvm.org/D6456

llvm-svn: 224764
2014-12-23 10:30:39 +00:00
Elena Demikhovsky 3121449f0b AVX-512: BLENDM - fixed encoding of the broadcast version
Added more intrinsics and encoding tests.

llvm-svn: 224760
2014-12-23 09:36:28 +00:00
Elena Demikhovsky 949b0d46bf AVX-512: Added all forms of BLENDM instructions,
intrinsics, encoding tests for AVX-512F and skx instructions.

llvm-svn: 224707
2014-12-22 13:52:48 +00:00
Elena Demikhovsky fb73ca516b Masked load and store codegen - fixed 128-bit vectors
The codegen failed on 128-bit types on AVX2.
I added patterns and in td files and tests.

llvm-svn: 224647
2014-12-19 23:27:57 +00:00
Robert Khasanov 8d9b93eac8 [AVX512] Add a comment for avx512_broadcast_pat multiclass
llvm-svn: 224341
2014-12-16 16:12:11 +00:00
Elena Demikhovsky 72860c341e AVX-512: Added EXPAND instructions and intrinsics.
llvm-svn: 224241
2014-12-15 10:03:52 +00:00
Robert Khasanov 4204c1acc6 [AVX512] Minor fix in lowering pattern for broadcast intrustions.
No functional change.

llvm-svn: 224122
2014-12-12 14:21:30 +00:00
Cameron McInally 5fb084e798 [AVX512] Add support for 512b variable bit shift intrinsics.
llvm-svn: 224028
2014-12-11 17:13:05 +00:00
Elena Demikhovsky 908dbf48c8 AVX-512: Added all forms of COMPRESS instruction
+ intrinsics + tests

llvm-svn: 224019
2014-12-11 15:02:24 +00:00
Robert Khasanov 8e8c39963d [AVX512] Added lowering for VBROADCASTSS/SD instructions.
Lowering patterns were written through avx512_broadcast_pat multiclass as pattern generates VBROADCAST and COPY_TO_REGCLASS nodes.
Added lowering tests.

llvm-svn: 223804
2014-12-09 18:45:30 +00:00
Robert Khasanov cbc5703aeb [AVX512] Added VPBROADCAST{BWDQ} (Load with Broadcast Integer Data from General Purpose Register) encodings for AVX512-BW/VL subsets
Added encoding tests.
        

llvm-svn: 223787
2014-12-09 16:38:41 +00:00
Elena Demikhovsky fa4a6c18f7 AVX-512: Added some comments to ERI scalar intrinsics.
No functional change.

llvm-svn: 223761
2014-12-09 07:06:32 +00:00
Elena Demikhovsky f1de34b84d Masked Load / Store Intrinsics - the CodeGen part.
I'm recommiting the codegen part of the patch.
The vectorizer part will be send to review again.

Masked Vector Load and Store Intrinsics.
Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores.
Added SDNodes for masked operations and lowering patterns for X86 code generator.
Examples:
<16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask)
declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask)

Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch.

http://reviews.llvm.org/D6191

llvm-svn: 223348
2014-12-04 09:40:44 +00:00
Michael Liao 5bf9578ce4 [X86] Clean up whitespace as well as minor coding style
llvm-svn: 223339
2014-12-04 05:20:33 +00:00
Duncan P. N. Exon Smith 9bc81fbe92 Revert "Masked Vector Load and Store Intrinsics."
This reverts commit r222632 (and follow-up r222636), which caused a host
of LNT failures on an internal bot.  I'll respond to the commit on the
list with a reproduction of one of the failures.

Conflicts:
	lib/Target/X86/X86TargetTransformInfo.cpp

llvm-svn: 222936
2014-11-28 21:29:14 +00:00
Elena Demikhovsky 905a5a606f AVX-512: Scalar ERI intrinsics
including SAE mode and memory operand.
Added AVX512_maskable_scalar template, that should cover all scalar instructions in the future.

The main difference between AVX512_maskable_scalar<> and AVX512_maskable<> is using X86select instead of vselect.
I need it, because I can't create vselect node for MVT::i1 mask for scalar instruction.

http://reviews.llvm.org/D6378

llvm-svn: 222820
2014-11-26 10:46:49 +00:00
Cameron McInally 9b7c15a364 [AVX512] Add 512b integer shift by variable intrinsics and patterns.
llvm-svn: 222786
2014-11-25 20:41:51 +00:00
Craig Topper edb091185d Remove space before tab in all AVX512 mnemonic strings.
llvm-svn: 222778
2014-11-25 20:11:23 +00:00
Elena Demikhovsky 9e5089a938 Masked Vector Load and Store Intrinsics.
Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores.
Added SDNodes for masked operations and lowering patterns for X86 code generator.
Examples:
<16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask)
declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask)

Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch.

http://reviews.llvm.org/D6191

llvm-svn: 222632
2014-11-23 08:07:43 +00:00
Craig Topper 949d50bc71 [x86] Remove two redundant isel patterns. They equivalent already exists in the instruction pattern.
llvm-svn: 222094
2014-11-16 09:24:16 +00:00
Cameron McInally 04400449c5 [AVX512] Add 512b masked integer shift by immediate patterns.
llvm-svn: 222002
2014-11-14 15:43:00 +00:00
Elena Demikhovsky be8808dc3f AVX-512: Intrinsics for ERI
3 instructions: vrcp28, vrsqrt28, vexp2, only vector forms.
Intrinsics include SAE (Suppres All Exceptions) parameter.

http://reviews.llvm.org/D6214

llvm-svn: 221774
2014-11-12 07:31:03 +00:00
Robert Khasanov af318f7073 [AVX512] Added VBROADCAST{SS/SD} encoding for VL subset.
Refactored through AVX512_maskable
        

llvm-svn: 220908
2014-10-30 14:21:47 +00:00
Robert Khasanov 595e598869 [AVX512] Implemented AVX512VL FP bnary packed instructions (VADDP*, VSUBP*, VMULP*, VDIVP*, VMAXP*, VMINP*)
Refactored through AVX512_maskable
Added encoding tests for them.

llvm-svn: 220858
2014-10-29 15:43:02 +00:00
Robert Khasanov 1cf354c92f [AVX512] Fix VSQRT packed instructions internal names.
No functional change

llvm-svn: 220808
2014-10-28 18:22:41 +00:00
Robert Khasanov eb12639375 [AVX512] Extended avx512_sqrt_packed (sqrt instructions) to VL subset.
Refactored through AVX512_maskable

llvm-svn: 220806
2014-10-28 18:15:20 +00:00
Robert Khasanov 3e534c93b9 [AVX-512] Expanded rsqrt/rcp instructions to VL subset.
Refactored multiclass through AVX512_maskable

llvm-svn: 220783
2014-10-28 16:37:13 +00:00
Robert Khasanov dd09a8f320 [AVX512] Bring back vector-shuffle lowering support through broadcasts
Ffter commit at rev219046 512-bit broadcasts lowering become non-optimal. Most of tests on broadcasting and embedded broadcasting were changed and they doesn’t produce efficient code.

Example below is from commit changes (it’s the first test from test/CodeGen/X86/avx512-vbroadcast.ll):

 define   <16 x i32> @_inreg16xi32(i32 %a) {
 ; CHECK-LABEL: _inreg16xi32:
 ; CHECK:       ## BB#0:
-; CHECK-NEXT:    vpbroadcastd %edi, %zmm0
+; CHECK-NEXT:    vmovd %edi, %xmm0
+; CHECK-NEXT:    vpbroadcastd %xmm0, %ymm0
+; CHECK-NEXT:    vinserti64x4 $1, %ymm0, %zmm0, %zmm0
 ; CHECK-NEXT:    retq
 %b = insertelement <16 x i32> undef, i32 %a, i32 0
 %c = shufflevector <16 x i32> %b, <16 x i32> undef, <16 x i32> zeroinitializer
 ret <16 x i32> %c
}

Here, 256-bit broadcast was generated instead of 512-bit one.

In this patch
1) I added vector-shuffle lowering through broadcasts
2) Removed asserts and branches likes because this is incorrect
-  assert(Subtarget->hasDQI() && "We can only lower v8i64 with AVX-512-DQI");
3) Fixed lowering tests

llvm-svn: 220774
2014-10-28 12:28:51 +00:00
Adam Nemet cf7a4a2660 [AVX512] Add vpermil variable version
This is implemented via a multiclass that derives from the vperm imm
multiclass.

Fixes <rdar://problem/18426089>

llvm-svn: 220737
2014-10-27 23:08:40 +00:00
Adam Nemet 8d85b0cd3e [AVX512] Clean up avx512_perm_imm to use X86VectorVTInfo
No functionality change.  No change in X86.td.expanded except that we only set
the CD8 attributes for the memory variants.  (This shouldn't be used unless we
have a memory operand.)

llvm-svn: 220736
2014-10-27 23:08:37 +00:00
Adam Nemet 9aad13164e [AVX512] Derive vpermil* from avx512_perm_imm
This used to derive from avx512_pshuf_imm which is confusing.

NFC.  Compared X86.td.expanded.

llvm-svn: 220735
2014-10-27 23:08:34 +00:00
Adam Nemet c51cee85b7 [AVX512] Fix copy-and-paste bugs in vpermil
1) i512mem -> f512mem (this is the packed FP input being permuted)
2) element size is 64 bits in EVEX_CD8 for PD.

(A good illustration why X86VectorVTInfo is useful)

llvm-svn: 220734
2014-10-27 23:08:31 +00:00
Elena Demikhovsky 4b01b7306c AVX-512: Fixed encoding of VPBROADCASTM and added SKX forms of this instruction
llvm-svn: 220638
2014-10-26 09:52:24 +00:00
Adam Nemet 832ec5e911 [AVX512] FMA support for the 231 variants
This is asm/diasm-only support, similar to AVX.

For ISeling the register variant, they are no different from 213 other than
whether the multiplication or the addition operand is destructed.

For ISeling the memory variant, i.e. to fold a load, they are no different
than the 132 variant.  The addition operand (op3) in both cases can come from
memory.  Again the ony difference is which operand is destructed.

There could be a post-RA pass that would convert a 213 or 132 into a 231.

Part of <rdar://problem/17082571>

llvm-svn: 220540
2014-10-24 00:03:00 +00:00
Adam Nemet 26371ce131 [AVX512] Introduce fma3p_forms from AVX
This multiclass generates the different forms: 213, 231, 132 in AVX.

132 in AVX512 is a separate class but I am planning to use this same
multiclass to generate 231 relying on the nice the null_frag trick from AVX to
disable codegen pattern for 231.

No functionality change, no change in X86.td.expanded except for the different
instruction definition names.

llvm-svn: 220539
2014-10-24 00:02:55 +00:00
Adam Nemet 4285c1f8cc [AVX512] Add DQ subvector inserts
In AVX512f we support 64x2 and 32x8 inserts via matching them to 32x4 and 64x4
respectively.  These are matched by "Alt" Pat<>'s (Alt stands for alternative
VTs).

Since DQ has native support for these intructions, I peeled off the non-"Alt"
part of the baseclass into vinsert_for_size_no_alt. The DQ instructions are
derived from this multiclass.  The "Alt" Pat<>'s are disabled with DQ.

Fixes <rdar://problem/18426089>

llvm-svn: 219874
2014-10-15 23:42:17 +00:00
Adam Nemet 449b3f0931 [AVX512] Two new attributes in X86VectorVTInfo for subvector insert
The new attributes are NumElts and the CD8TupleForm.  This prepares the code
to enable x8 and x2 inserts.

NFC, no change in X86.td.expanded except for the new attributes.

llvm-svn: 219871
2014-10-15 23:42:09 +00:00
Adam Nemet b1c3ef4b60 [AVX512] Rename arg from Opcode32/64 to Opcode128/256 in vinsert_for_size
It's the W bit that selects between 32 or 64 elt type and not the opcode.  The
opcode selects between the width of the insert (128 or 256).

llvm-svn: 219870
2014-10-15 23:42:04 +00:00
Robert Khasanov 1a77f6664e [AVX512] Extended avx512_binop_rm to DQ/VL subsets.
Added encoding tests.

llvm-svn: 219686
2014-10-14 15:13:56 +00:00
Robert Khasanov 545d1b7726 [AVX512] Extended avx512_binop_rm to BW/VL subsets.
Added encoding tests.

llvm-svn: 219685
2014-10-14 14:36:19 +00:00
Robert Khasanov d5b14f7994 [AVX512] Extended avx512_binop_rm for AVX512VL subsets.
Added avx512_binop_rm_vl multiclass for VL subset
Added encoding tests

llvm-svn: 219390
2014-10-09 08:38:48 +00:00
Adam Nemet 3480142ef0 [AVX512] Rename AVX512_masking* to AVX512_maskable*
No functional change.

This is the current AVX512_maskable multiclass hierarchy:

                 maskable_custom
                    /       \
                   /         \
          maskable_common   maskable_in_asm
            /         \
           /           \
      maskable        maskable_3src

llvm-svn: 219363
2014-10-08 23:25:39 +00:00
Adam Nemet 47b2d5f1e0 [AVX512] Intrinsics for vextract*x4
This adds the Pat<>'s for the intrinsics.  These are necessary because we
don't lower these intrinsics to SDNodes but match them directly.  See the
rational in the previous commit.

llvm-svn: 219362
2014-10-08 23:25:37 +00:00
Adam Nemet 2b5cdbb3de [AVX512] Add asm-only support for vextract*x4 masking variants
These derive from the new asm-only masking definitions.

Unfortunately I wasn't able to find a ISel pattern that we could legally
generate for the masking variants.  The problem is that since the destination
is v4* we would need VK4 register classes and v4i1 value types to express the
masking.  These are however not legal types/classes in AVX512f but only in VL,
so things get complicated pretty quickly.  We can revisit this question later
if we have a more pressing need to express something like this.

So the ISel patterns are empty for the masking instructions and the next patch
will add Pat<>s instead to match the intrinsics calls with instructions.

llvm-svn: 219361
2014-10-08 23:25:33 +00:00
Adam Nemet 0937723b49 [AVX512] Move DAG for all-zero node to X86VectorVTInfo
No functional change.

No change in X86.td.expanded except for the appearance of the new attributes.

The new attributes will be used in the subsequent patch.

llvm-svn: 219360
2014-10-08 23:25:31 +00:00
Adam Nemet 52bb6cfad6 [AVX512] Peel off an asm-only class from AVX512_masking_common.
No functional change.

This enables the generation of masking instructions that don't provide a
ISel pattern.

llvm-svn: 219358
2014-10-08 23:25:23 +00:00
Robert Khasanov 44241440e1 [AVX512] Refactoring of avx512_binop_rm multiclass through AVX512_masking.
Added new argrument for AVX512_masking: InstrItinClass and bit isCommutable.
No functional change.

llvm-svn: 219310
2014-10-08 14:37:45 +00:00
Elena Demikhovsky 44bf0637d5 AVX-512-SKX: Added instruction VPMOVM2B/W/D/Q.
This instruction allows to broadacst mask vector to data vector.

llvm-svn: 219083
2014-10-05 14:11:08 +00:00
Adam Nemet 4dca3ce4b0 [AVX512] Pull pattern for subvector insert into the instruction definition
No functional change intended.

Very similar to the change I made for subvector extract in r218480.

test/CodeGen/X86/avx512-insert-extract.ll covers this.

llvm-svn: 218928
2014-10-02 23:18:30 +00:00
Adam Nemet 4e2ef472d2 [AVX512] Refactor subvector inserts
No functional change.

Very similar to the extract refactoring I did in r218478.

Compared X86.td.expanded before and after.

llvm-svn: 218927
2014-10-02 23:18:28 +00:00
Adam Nemet dc87aea176 [AVX512] Fix i256mem->f256mem typo in VINSERTF64x4rm
Just like in the case of extracts, the refactoring is uncovering some typos in
the code.

llvm-svn: 218926
2014-10-02 23:18:26 +00:00
Adam Nemet 05d8c8e682 [AVX512] Remove space before \t in AsmStrings.
llvm-svn: 218725
2014-10-01 00:41:32 +00:00
Robert Khasanov 5aa4445bde [AVX512] Added intrinsics for 128- and 256-bit versions of VCMPEQ{BWDQ}
Fixed lowering of this intrinsics in case when mask is v2i1 and v4i1.
Now cmp intrinsics lower in the following way:
 (i8 (int_x86_avx512_mask_pcmpeq_q_128
             (v2i64 %a), (v2i64 %b), (i8 %mask))) ->
 (i8 (bitcast
   (v8i1 (insert_subvector undef,
           (v2i1 (and (PCMPEQM %a, %b),
                      (extract_subvector
                         (v8i1 (bitcast %mask)), 0))), 0))))

llvm-svn: 218669
2014-09-30 11:41:54 +00:00
Adam Nemet 6bddb8c3a5 [AVX512] Use X86VectorVTInfo in the masking helper classes and the FMAs
No functionality change.

Makes the code more compact (see the FMA part).

This needs a new type attribute MemOpFrag in X86VectorVTInfo.  For now I only
defined this in the simple cases.  See the commment before the attribute.

Diff of X86.td.expanded before and after is empty except for the appearance of
the new attribute.

llvm-svn: 218637
2014-09-29 22:54:41 +00:00
Adam Nemet ce465421d7 [AVX512] Simplify use of !con()
No change in X86.td.expanded.

llvm-svn: 218485
2014-09-26 00:53:12 +00:00
Adam Nemet f7988d7364 [AVX512] Pull pattern for subvector extract into the instruction definition
No functional change.

I initially thought that pulling the Pat<> into the instruction pattern was
not possible because it was doing a transform on the index in order to convert
it from a per-element (extract_subvector) index into a per-chunk (vextract*x4)
index.

Turns out this also works inside the pattern because the vextract_extract
PatFrag has an OperandTransform EXTRACT_get_vextract{128,256}_imm, so the
index in $idx goes through the same conversion.

The existing test CodeGen/X86/avx512-insert-extract.ll extended in the
previous commit provides coverage for this change.

llvm-svn: 218480
2014-09-25 23:48:49 +00:00
Adam Nemet 55536c6a8f [AVX512] Refactor subvector extracts
No functional change.

These are now implemented as two levels of multiclasses heavily relying on the
new X86VectorVTInfo class.  The multiclass at the first level that is called
with float or int provides the 128 or 256 bit subvector extracts.  The second
level provides the register and memory variants and some more Pat<>s.

I've compared the td.expanded files before and after.  One change is that
ExeDomain for 64x4 is SSEPackedDouble now.  I think this is correct, i.e. a
bugfix.

(BTW, this is the change that was blocked on the recent tablegen fix.  The
class-instance values X86VectorVTInfo inside vextract_for_type weren't
properly evaluated.)

Part of <rdar://problem/17688758>

llvm-svn: 218478
2014-09-25 23:48:45 +00:00
Adam Nemet 6ea09eb148 [AVX512] Fix typo
F->I in VEXTRACTF32x4rr.

llvm-svn: 218477
2014-09-25 23:48:42 +00:00
Chandler Carruth ed5dfff865 [x86] Rename X86ISD::VPERMILP to X86ISD::VPERMILPI (and the same for the
td pattern). Currently we only model the immediate operand variation of
VPERMILPS and VPERMILPD, we should make that clear in the pseudos used.
Will be adding support for the variable mask variant in my next commit.

llvm-svn: 218282
2014-09-22 22:29:42 +00:00
Robert Khasanov f70f798474 [SKX] Deriving rmb multiclasses from general one (avx512_icmp_packed_rmb and avx512_icmp_cc_rmb).
Thanks Adam Nemet for notice about this.

llvm-svn: 218051
2014-09-18 14:06:55 +00:00
Chandler Carruth 373b2b1728 [x86] Fix a pretty horrible bug and inconsistency in the x86 asm
parsing (and latent bug in the instruction definitions).

This is effectively a revert of r136287 which tried to address
a specific and narrow case of immediate operands failing to be accepted
by x86 instructions with a pretty heavy hammer: it introduced a new kind
of operand that behaved differently. All of that is removed with this
commit, but the test cases are both preserved and enhanced.

The core problem that r136287 and this commit are trying to handle is
that gas accepts both of the following instructions:

  insertps $192, %xmm0, %xmm1
  insertps $-64, %xmm0, %xmm1

These will encode to the same byte sequence, with the immediate
occupying an 8-bit entry. The first form was fixed by r136287 but that
broke the prior handling of the second form! =[ Ironically, we would
still emit the second form in some cases and then be unable to
re-assemble the output.

The reason why the first instruction failed to be handled is because
prior to r136287 the operands ere marked 'i32i8imm' which forces them to
be sign-extenable. Clearly, that won't work for 192 in a single byte.
However, making thim zero-extended or "unsigned" doesn't really address
the core issue either because it breaks negative immediates. The correct
fix is to make these operands 'i8imm' reflecting that they can be either
signed or unsigned but must be 8-bit immediates. This patch backs out
r136287 and then changes those places as well as some others to use
'i8imm' rather than one of the extended variants.

Naturally, this broke something else. The custom DAG nodes had to be
updated to have a much more accurate type constraint of an i8 node, and
a bunch of Pat immediates needed to be specified as i8 values.

The fallout didn't end there though. We also then ceased to be able to
match the instruction-specific intrinsics to the instructions so
modified. Digging, this is because they too used i32 rather than i8 in
their signature. So I've also switched those intrinsics to i8 arguments
in line with the instructions.

In order to make the intrinsic adjustments of course, I also had to add
auto upgrading for the intrinsics.

I suspect that the intrinsic argument types may have led everything down
this rabbit hole. Pretty happy with the result.

llvm-svn: 217310
2014-09-06 10:00:01 +00:00
Robert Khasanov 29e3b96734 [SKX] Added new versions of cmp instructions in avx512_icmp_cc multiclass, added VL multiclass.
Added encoding tests

llvm-svn: 216532
2014-08-27 09:34:37 +00:00
Elena Demikhovsky ff620edd3c AVX-512: Added intrinsic for VMOVSS store form with mask.
llvm-svn: 216530
2014-08-27 07:38:43 +00:00
Robert Khasanov 2ea081d4d1 [SKX] avx512_icmp_packed multiclass extension
Extended avx512_icmp_packed multiclass by masking versions.
Added avx512_icmp_packed_rmb multiclass for embedded broadcast versions.
Added corresponding _vl multiclasses.
Added encoding tests for CPCMP{EQ|GT}* instructions.
Add more fields for X86VectorVTInfo.
Added AVX512VLVectorVTInfo that include X86VectorVTInfo for 512/256/128-bit versions

Differential Revision: http://reviews.llvm.org/D5024

llvm-svn: 216383
2014-08-25 14:49:34 +00:00
Adam Nemet 5ed17dad95 [AVX512] Add class to group common template arguments related to vector type
We discussed the issue of generality vs. readability of the AVX512 classes
recently.  I proposed this approach to try to hide and centralize the mappings
we commonly perform based on the vector type.  A new class X86VectorVTInfo
captures these.

The idea is to pass an instance of this class to classes/multiclasses instead
of the corresponding ValueType.  Then the class/multiclass can use its field
for things that derive from the type rather than passing all those as separate
arguments.

I modified avx512_valign to demonstrate this new approach.  As you can see
instead of 7 related template parameters we now have one.  The downside is
that we have to refer to fields for the derived values.  I named the argument
'_' in order to make this as invisible as possible.  Please let me know if you
absolutely hate this.  (Also once we allow local initializations in
multiclasses we can recover the original version by assigning the fields to
local variables.)

Another possible use-case for this class is to directly map things, e.g.:

  RegisterClass KRC = X86VectorVTInfo<32, i16>.KRC

llvm-svn: 216209
2014-08-21 19:50:07 +00:00
Elena Demikhovsky 34d2d76d25 AVX-512: Fixed a bug in emitting compare for MVT:i1 type.
Added a test.

llvm-svn: 215889
2014-08-18 11:59:06 +00:00
Adam Nemet 2e91ee58fe [AVX512] Add masking variant for the FMA instructions
This change further evolves the base class AVX512_masking in order to make it
suitable for the masking variants of the FMA instructions.

Besides AVX512_masking there is now a new base class that instructions
including FMAs can use: AVX512_masking_3src.  With three-source (destructive)
instructions one of the sources is already tied to the destination.  This
difference from AVX512_masking is captured by this new class.  The common bits
between _masking and _masking_3src are broken out into a new super class
called AVX512_masking_common.

As with valign, there is some corresponding restructuring of the underlying
format classes.  The idea is the same we want to derive from two classes
essentially: one providing the format bits and another format-independent
multiclass supplying the various masking and non-masking instruction variants.

Existing fma tests in avx512-fma*.ll provide coverage here for the non-masking
variants.  For masking, the next patches in the series will add intrinsics and
intrinsic tests.

For AVX512_masking_3src to work, the (ins ...) dag has to be passed *without*
the leading source operand that is tied to dst ($src1).  This is necessary to
properly construct the (ins ...) for the different variants.  For the record,
I did check that if $src is mistakenly included, you do get a fairly intuitive
error message from the tablegen backend.

Part of <rdar://problem/17688758>

llvm-svn: 215660
2014-08-14 17:13:19 +00:00
Robert Khasanov ed8829703f [SKX] Extended non-temporal load/store instructions for AVX512VL subsets.
Added avx512_movnt_vl multiclass for handling 256/128-bit forms of instruction.
Added encoding and lowering tests.

Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>

llvm-svn: 215536
2014-08-13 10:46:00 +00:00
Adam Nemet cee9d0a460 [AVX512] Handle valign masking intrinsic via C++ lowering
I think that this will scale better in most cases than adding a Pat<> for each
mapping from the intrinsic DAG to the intruction (i.e. rri, rrik, rrikz).  We
can just lower to the SDNode and have the resulting DAG be matches by the DAG
patterns.

Alternatively (long term), we could keep the Pat<>s but generate them via the
new AVX512_masking multiclass.  The difficulty is that in order to formulate
that we would have to concatenate DAGs.  Currently this is only supported if
the operators of the input DAGs are identical.

llvm-svn: 215473
2014-08-12 21:13:12 +00:00
Elena Demikhovsky 40a77144a4 AVX-512: added a missing bitcast from v16f32 to v16i32
llvm-svn: 215351
2014-08-11 09:59:08 +00:00
Adam Nemet 7d498629f1 [AVX512] Add zero-masking variant to AVX512_masking multiclass
This completes one item from the todo-list of r215125 "Generate masking
instruction variants with tablegen".

The AddedComplexity is needed just like for the k variant.

Added a codegen test based on valignq.

llvm-svn: 215173
2014-08-07 23:53:38 +00:00
Adam Nemet fa1f7201fc [AVX512] Add codegen test for the masking variant of valign
The AddedComplexity is needed just like in avx512_perm_3src.  There may be a
bug in the complexity computation...

llvm-svn: 215168
2014-08-07 23:18:18 +00:00
Adam Nemet 2e2537f665 [AVX512] Generate masking instruction variants with tablegen
After adding the masking variants to several instructions, I have decided to
experiment with generating these from the non-masking/unconditional
variant. This will hopefully reduce the amount repetition that we currently
have in order to define an instruction with all its variants (for a reg/mem
instruction this would be 6 instruction defs and 2 Pat<> for the intrinsic).

The patch is the first cut that is currently only applied to valignd/q to make
the patch small.

A few notes on the approach:

  * In order to stitch together the dag for both the conditional and the
  unconditional patterns I pass the RHS of the set rather than the full
  pattern (set dest, RHS).
  * Rather than subclassing each instruction base class (e.g. AVX512AIi8),
  with a masking variant which wouldn't scale, I derived the masking
  instructions from a new base class AVX512 (this is just I<> with
  Requires<HasAVX512>).  The instructions derive from this now, plus a new set
  of classes that add the format bits and everything else that instruction
  base class provided (i.e. AVX512AIi8 vs. AVX512AIi8Base).

I hope we can go incrementally from here.  I expect that:

  * We will need different variants of the masking class.  One example is
  instructions requiring three vector sources.  In this case we tie one of the
  source operands to dest rather than a new implicit source operand ($src0)
  * Add the zero-masking variant
  * Add more AVX512*Base classes as new uses are added

I've looked at X86.td.expanded before and after to make sure that nothing got
lost for valignd/q.

llvm-svn: 215125
2014-08-07 17:53:55 +00:00
Adam Nemet 5ec912881f [X86] Fixes commit r214890 to match the posted patch
This was another fallout from my local rebase where something went wrong :(

llvm-svn: 214951
2014-08-06 07:13:12 +00:00
Adam Nemet fd2161b710 [AVX512] Add masking variant and intrinsics for valignd/q
This is similar to what I did with the two-source permutation recently.  (It's
almost too similar so that we should consider generating the masking variants
with some tablegen help.)

Both encoding and intrinsic tests are added as well.  For the latter, this is
what the IR that the intrinsic test on the clang side generates.

Part of <rdar://problem/17688758>

llvm-svn: 214890
2014-08-05 17:23:04 +00:00
Adam Nemet 164b07fbfe [X86] Add lowering to VALIGN
This was currently part of lowering to PALIGNR with some special-casing to
make interlane shifting work.  Since AVX512F has interlane alignr (valignd/q)
and AVX512BW has vpalignr we need to support both of these *at the same time*,
e.g. for SKX.

This patch breaks out the common code and then add support to check both of
these lowering options from LowerVECTOR_SHUFFLE.

I also added some FIXMEs where I think the AVX512BW and AVX512VL additions
should probably go.

llvm-svn: 214888
2014-08-05 17:22:59 +00:00
Adam Nemet d00a05e3e2 [AVX512] alignr: Use suffix rather than name argument to multiclass
Again no functional change.  This prepares for the suffix to be used with the
intrinsic matching.

llvm-svn: 214886
2014-08-05 17:22:52 +00:00
Adam Nemet f92139dd61 [AVX512] Pull everything alignr-related into the multiclass
The packed integer pattern becomes the DAG pattern for rri and the packed
float, another Pat<> inside the multiclass.

No functional change.

llvm-svn: 214885
2014-08-05 17:22:50 +00:00
Adam Nemet 1c752d8f5e Wrap long lines
llvm-svn: 214884
2014-08-05 17:22:47 +00:00
Robert Khasanov 7ca7df0bf9 [SKX] Enabling load/store instructions: encoding
Instructions: VMOVAPD, VMOVAPS, VMOVDQA8, VMOVDQA16, VMOVDQA32,VMOVDQA64, VMOVDQU8, VMOVDQU16, VMOVDQU32,VMOVDQU64, VMOVUPD, VMOVUPS,

Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>

llvm-svn: 214719
2014-08-04 14:35:15 +00:00
Robert Khasanov 595683da00 [SKX] Enabling mask logic instructions: encoding, lowering
Instructions: KAND{BWDQ}, KANDN{BWDQ}, KOR{BWDQ}, KXOR{BWDQ}, KXNOR{BWDQ}

Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>

llvm-svn: 214081
2014-07-28 13:46:45 +00:00
Robert Khasanov 74acbb7767 [SKX] Enabling mask instructions: encoding, lowering
KMOVB, KMOVW, KMOVD, KMOVQ, KNOTB, KNOTW, KNOTD, KNOTQ

Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>

llvm-svn: 213757
2014-07-23 14:49:42 +00:00
Elena Demikhovsky f164859efc AVX-512: Fixed intrinsic of VSQRTPS/PD instructions.
I set number and types of parameters according to GCC intrinsics.

llvm-svn: 213640
2014-07-22 11:07:31 +00:00
Robert Khasanov bfa0131365 [SKX] Enabling SKX target and AVX512BW, AVX512DQ, AVX512VL features.
Enabling HasAVX512{DQ,BW,VL} predicates.
Adding VK2, VK4, VK32, VK64 masked register classes.
Adding new types (v64i8, v32i16) to VR512.
Extending calling conventions for new types (v64i8, v32i16)

Patch by Zinovy Nis <zinovy.y.nis@intel.com>
Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>

llvm-svn: 213545
2014-07-21 14:54:21 +00:00
Adam Nemet 79580db918 [X86] AVX512: Only allow k1-k7 as predicates to vpcmp*
As destination k0 is allowed but not as predicate/writemask.

I also modified the test to allow checking of error messages by the assembler.
I applied a similar approach to the test ret.s in the same directory.

llvm-svn: 212504
2014-07-08 00:22:32 +00:00
Adam Nemet 11dd5cf9f1 [X86] AVX512: Allow writemask argument in vpermt* intrinsics
llvm-svn: 212223
2014-07-02 21:26:01 +00:00
Adam Nemet efe9c98a16 [X86] AVX512: Generate Pat<>'s for the vpermt2* intrinsics via multiclass
This new multiclass, avx512_perm_table_3src derives from the current one and
provides the Pat<>.  The next patch will add another Pat<> that uses the
writemask.

Note that I dropped the type annotation from the intrinsic call, i.e.: (v16f32
VR512:$src1) -> R512:$src1.  I think that this should be fine (at least many
intrinsic calls don't provide them) and it greatly reduces the number of
template arguments.

llvm-svn: 212222
2014-07-02 21:25:58 +00:00
Adam Nemet 2415a497b5 [X86] AVX512: Add writemask variants for vperm*2*
This includes assembler and codegen support (see the new tests in
avx512-encodings.s and avx512-shuffle.ll).

<rdar://problem/17492620>

llvm-svn: 212221
2014-07-02 21:25:54 +00:00
Adam Nemet 16de2486cb [X86] AVX512: Allow writemasks with vpcmp
For now I only updated the _alt variants.  The main variants are used by
codegen and that will need a bit more work to trigger.

<rdar://problem/17492620>

llvm-svn: 212114
2014-07-01 18:03:45 +00:00
Adam Nemet 1efcb90fcd [X86] AVX512: Factor generating the AsmString into avx512_icmp_cc
Adding a writemask variant would require a third asm string to be passed to
the template.  Generate the AsmString in the template instead.

No change in X86.td.expanded.

llvm-svn: 212113
2014-07-01 18:03:43 +00:00
Adam Nemet 73f72e15ac [X86] AVX512: Add vbroadcasti*
For now I used a separate template for these sub-vector/tuple broadcasts
rather than sharing the mem variants with avx512_int_broadcast_rm.

<rdar://problem/17402869>

llvm-svn: 211828
2014-06-27 00:43:38 +00:00
Adam Nemet 905832bf87 [X86] AVX512: Fix asm syntax for packed vcmp
The *_alt defs for vcmp are used by the InstParser (the asm string in the main
def is used by the InstPrinter) .  The former was accepting vector registers
as destination rather than mask registers.

llvm-svn: 211750
2014-06-26 00:21:12 +00:00
Adam Nemet efd0785d82 [X86] AVX512: Add non-temporal stores
Note that I followed the AVX2 convention here and didn't add LLVM intrinsics
for stores.  These can be generated with the nontemporal hint on LLVM IR
stores (see new test). The GCC builtins are lowered directly into nontemporal
stores.

<rdar://problem/17082571>

llvm-svn: 211176
2014-06-18 16:51:10 +00:00
Adam Nemet ded81a810c [X86] AVX512: Specify compressed displacement for vmovntdqa
Use the max 64-bit element size with EVEX_CD8.  This should work since element
size is ignored for a full-vector access (FVM).

llvm-svn: 211175
2014-06-18 16:51:07 +00:00
Cameron McInally f10a7c963b Add pattern for unsigned v4i32->v4f64 convert on AVX512.
llvm-svn: 211164
2014-06-18 14:04:37 +00:00