llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	1a0da2db5f	[X86] Add support for combining FMADDSUB(A, B, FNEG(C))->FMSUBADD(A, B, C) Support the opposite direction as well. Also add a TODO for not being able to combine FMSUB/FNMADD/FNMSUB with FNEG. llvm-svn: 317878	2017-11-10 08:22:37 +00:00
Craig Topper	93e27d2ecc	[X86] Make sure we don't read too many operands from X86ISD::FMADDS1/FMADDS3 nodes when doing FNEG combine. r317453 added new ISD nodes without rounding modes that were added to an existing if/else chain. But all the previous nodes handled there included a rounding mode. The final code after this if/else chain expected an extra operand that isn't present for the new nodes. llvm-svn: 317748	2017-11-09 01:06:47 +00:00
Guy Blank	548e22a1a7	[X86][AVX512] Make i1 illegal in the CodeGen This patch defines the i1 type as illegal in the X86 backend for AVX512. For DAG operations on <N x i1> types (build vector, extract vector element, ...) i8 is used, and should be truncated/extended. This should produce better scalar code for i1 types since GPRs will be used instead of mask registers. Differential Revision: https://reviews.llvm.org/D32273 llvm-svn: 303421	2017-05-19 12:35:15 +00:00
Craig Topper	d284606327	[AVX-512] Remove explicit KMOVWrk/KMOVWKr instructions from patterns where we can just use COPY_TO_REGCLASS instead. This will result in a KMOVW or KMOVD being emitted during register allocation. And in at least some cases this might allow the register coalescer to remove the copy altogether. llvm-svn: 298984	2017-03-29 06:55:28 +00:00
Craig Topper	058f2f6d72	[AVX-512] Fix accidental uses of AH/BH/CH/DH after copies to/from mask registers We've had several bugs (PR32256, PR32241) recently that resulted from usages of AH/BH/CH/DH either before or after a copy to/from a mask register. This ultimately occurs because we create COPY_TO_REGCLASS with VK1 and GR8. Then in CopyToFromAsymmetricReg in X86InstrInfo we find a 32-bit super register for the GR8 to emit the KMOV with. But as these tests are demonstrating, it's possible for the GR8 register to be a high register and we end up doing an accidental extract or insert from bits 15:8. I think the best way forward is to stop making copies directly between mask registers and GR8/GR16. Instead I think we should restrict to only copies between mask registers and GR32/GR64 and use EXTRACT_SUBREG/INSERT_SUBREG to handle the conversion from GR32 to GR16/8 or vice versa. Unfortunately, this complicates fastisel a bit more now to create the subreg extracts where we used to create GR8 copies. We can probably make a helper function to bring down the repetition. This does result in KMOVD being used for copies when BWI is available because we don't know the original mask register size. This caused a lot of deltas on tests because we have to split the checks for KMOVD vs KMOVW based on BWI. Differential Revision: https://reviews.llvm.org/D30968 llvm-svn: 298928	2017-03-28 16:35:29 +00:00
Craig Topper	2caa97c891	[AVX-512] Fix the execution domain for scalar FMA instructions. llvm-svn: 296271	2017-02-25 19:36:28 +00:00
Craig Topper	a74e3088df	[AVX-512] Remove patterns from the other VBLENDM instructions. They are all redundant with masked move instructions. We should probably teach the two address instruction pass to turn masked moves into BLENDM when it's beneficial to the register allocator. llvm-svn: 291371	2017-01-07 22:20:34 +00:00
Craig Topper	a55b483bb5	[AVX-512] Correctly preserve the passthru semantics of the FMA scalar intrinsics Summary: Scalar intrinsics have specific semantics about which input's upper bits are passed through to the output. The same input is also supposed to be the input we use for the lower element when the mask bit is 0 in a masked operation. We aren't currently keeping these semantics with instruction selection. This patch corrects this by introducing new scalar FMA ISD nodes that indicate whether operand 1 (one of the multiply inputs) or operand 3 (the addition/subtraction input) should pass thru its upper bits. We use this information to select 213/132 form for the operand 1 version and the 231 form for the operand 3 version. We also use this information to suppress combining FNEG operations on the passthru input since semantically the passthru bits aren't negated. This is stronger than the earlier check added for a user being SELECTS so we can remove that. This fixes PR30913. Reviewers: delena, zvi, v_klochkov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27144 llvm-svn: 289190	2016-12-09 06:42:28 +00:00
Elena Demikhovsky	519b4ccd70	Fixed FMA + FNEG combine. Masked form of FMA should be omitted in this optimization. Differential Revision: https://reviews.llvm.org/D25984 llvm-svn: 285492	2016-10-29 08:44:46 +00:00
Elena Demikhovsky	f0ddd1b8b5	AVX512F: FMA intrinsic + FNEG - sequence optimization The previous commit (r280368 - https://reviews.llvm.org/D23313) does not cover AVX-512F, KNL set. FNEG(x) operation is lowered to (bitcast (vpxor (bitcast x), (bitcast constfp(0x80000000))). It happens because FP XOR is not supported for 512-bit data types on KNL and we use integer XOR instead. I added pattern match for integer XOR. Differential Revision: https://reviews.llvm.org/D24221 llvm-svn: 280785	2016-09-07 06:54:28 +00:00
Elena Demikhovsky	4d7738dfde	Optimized FMA intrinsic + FNEG, like -(ab+c) and FNEG + FMA, like ab-c or (-a)*b+c. The bug description is here: https://llvm.org/bugs/show_bug.cgi?id=28892 Differential revision: https://reviews.llvm.org/D23313 llvm-svn: 280368	2016-09-01 13:58:53 +00:00
Craig Topper	713085e60a	[X86] Don't lower FABS/FNEG masking directly to a ConstantPool load. Just create a ConstantFPSDNode and let that be lowered. This allows broadcast loads to be used when available. llvm-svn: 279958	2016-08-29 04:49:31 +00:00
Vyacheslav Klochkov	6daefcf626	X86-FMA3: Implemented commute transformation for EVEX/AVX512 FMA3 opcodes. This helped to improve memory-folding and register coalescing optimizations. Also, this patch fixed the tracker #17229. Reviewer: Craig Topper. Differential Revision: https://reviews.llvm.org/D23108 llvm-svn: 278431	2016-08-11 22:07:33 +00:00
Elena Demikhovsky	0e0e07f436	AVX-512: A new test for FMA intrinsic A new test that explores sub-optimal sequence of FMA intrinsic and FNEG operation. An upcoming patch will fix it. llvm-svn: 278117	2016-08-09 11:54:14 +00:00

14 Commits
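
Several of the messages above (r280368, r280785, r317878) describe folding an FNEG into a neighbouring FMA, and r280785 notes that on KNL the FNEG shows up as an integer XOR with the constant 0x80000000. The following is only an illustrative sketch of the arithmetic behind those folds, not the LLVM implementation: the function names `fneg_via_xor`, `fmadd`, `fmsub`, `fnmadd` and `fnmsub` are hypothetical helpers chosen here, and the products are computed unfused, so the sketch shows the algebraic identities rather than single-rounding FMA behaviour.

```python
import struct

# Sign-bit mask for a 32-bit float, the constant quoted in the r280785 message.
SIGN_BIT_F32 = 0x80000000


def fneg_via_xor(x: float) -> float:
    """Negate a float32 value by flipping its sign bit with an integer XOR."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    (negated,) = struct.unpack("<f", struct.pack("<I", bits ^ SIGN_BIT_F32))
    return negated


# Hypothetical scalar models of the four FMA flavours (unfused arithmetic).
def fmadd(a: float, b: float, c: float) -> float:
    return a * b + c


def fmsub(a: float, b: float, c: float) -> float:
    return a * b - c


def fnmadd(a: float, b: float, c: float) -> float:
    return -(a * b) + c


def fnmsub(a: float, b: float, c: float) -> float:
    return -(a * b) - c


if __name__ == "__main__":
    # Values chosen so every intermediate result is exactly representable.
    a, b, c = 1.5, 2.0, 0.25
    # -(a*b + c): FNEG applied to the result of an FMADD matches FNMSUB.
    print(fneg_via_xor(fmadd(a, b, c)) == fnmsub(a, b, c))
    # a*b - c: FNEG on the addend turns FMADD into FMSUB.
    print(fmadd(a, b, fneg_via_xor(c)) == fmsub(a, b, c))
    # (-a)*b + c: FNEG on a multiplicand turns FMADD into FNMADD.
    print(fmadd(fneg_via_xor(a), b, c) == fnmadd(a, b, c))
```

The inputs are small values that are exact in float32, so the comparisons do not depend on rounding. Masked and passthru forms, which r285492 and r289190 exclude from the combine, are deliberately not modelled.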