Quentin Colombet
246b6fcd28
[X86] Selectively mark the FMA variants inside a family as isCommutable.
...
Given a FMA family (e.g., 213, 231), not all the variants (i.e., register or
memory) are commutable.
E.g., for the 213 family (with the syntax src1, src2, src3):
fmaXXX213 A, B, reg3/mem3 == fmaXXX213 B, A, reg3/mem3
Now consider the 231 family:
fmaXXX231 A, B, reg3 == fmaXXX231 A, reg3, B
But
fmaXXX231 A, B, mem3 != fmaXXX231 A, mem3, B
Indeed, mem3 cannot be the second argument of the memory variant of fmaXXX231.
Working on a reduced test case!
<rdar://problem/16800495>
llvm-svn: 208252
2014-05-07 21:43:35 +00:00
Lang Hames
c2c751312e
[X86] Make the VFMA*231 variants commutable and relax the alignment restrictions
...
on FMA3 memory operands. FMA3 instructions are VEX encoded, so they can load
from unaligned memory.
Testcase to follow, along with related patch.
<rdar://problem/16478629>
llvm-svn: 205472
2014-04-02 22:06:16 +00:00
Lang Hames
3303a339b1
[X86] Only 213 FMA3 variants should be marked commutable.
...
Commuting the 231 and 132 variants would swap addends and
multiplicands/multipliers, which isn't valid.
I'm still trying to reduce a decent test case for this.
llvm-svn: 200792
2014-02-04 19:42:47 +00:00
Lang Hames
5ec150c967
Replace X86 FMA intrinsic pseduo-instructions with def pats.
...
It looks like these pseudos were only used for pattern matching. Def pats are
the appropriate way to do that. As a bonus, these intrinsics will now have
memory operands folded properly, and better FMA3 variants selected where
appropriate (see r199933).
<rdar://problem/15611947>
llvm-svn: 200577
2014-01-31 21:29:19 +00:00
Lang Hames
23de211c5d
Replace vfmaddxx213 instructions with their 231-type equivalents in accumulator
...
loops. Writing back to the accumulator (231-type) allows the coalescer to
eliminate an extra copy.
llvm-svn: 199933
2014-01-23 20:23:36 +00:00
Craig Topper
3484fc2161
Add a new x86 specific instruction flag to force some isCodeGenOnly instructions to go through to the disassembler tables without resorting to string matches. Apply flag to all _REV instructions.
...
llvm-svn: 198543
2014-01-05 04:17:28 +00:00
Craig Topper
9dd48c8ed4
Mark all x86 Int_ and _Int patterns as isCodeGenOnly so the disassembler table builder doesn't need to string match them to exclude them.
...
llvm-svn: 198323
2014-01-02 17:28:14 +00:00
Craig Topper
ed59dd34fd
Various x86 disassembler fixes.
...
Add VEX_LIG to scalar FMA4 instructions.
Use VEX_LIG in some of the inheriting checks in disassembler table generator.
Make use of VEX_L_W, VEX_L_W_XS, VEX_L_W_XD contexts.
Don't let VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE inherit from their non-L forms unless VEX_LIG is set.
Let VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE inherit from all of their non-L or non-W cases.
Increase ranking on VEX_L_W, VEX_L_W_XS, VEX_L_W_XD, VEX_L_W_OPSIZE so they get chosen over non-L/non-W forms.
llvm-svn: 191649
2013-09-30 02:46:36 +00:00
Craig Topper
58f6e64e07
Remove alignment restrictions from FMA load folding.
...
llvm-svn: 191136
2013-09-21 05:58:59 +00:00
Craig Topper
0d2c29e807
Simplify nested strconcats in X86 td files since strconcat can take more than 2 arguments.
...
llvm-svn: 172379
2013-01-14 07:46:34 +00:00
Craig Topper
1b8c0750ee
Mark all the _REV instructions as not having side effects. They aren't really emitted by the backend, but it reduces the number of instructions in the output files with unmodelled side effects to make auditing easier.
...
llvm-svn: 171118
2012-12-26 21:30:22 +00:00
Craig Topper
72beaa6733
Fix execution domain for packed FMA4 instructions.
...
llvm-svn: 168417
2012-11-21 08:08:21 +00:00
Craig Topper
a73be890a1
Add explicit VEX_L tags to all 256-bit instructions. This will allow us to remove code from the code emitters that examined operands to set the L-bit.
...
llvm-svn: 164202
2012-09-19 06:06:34 +00:00
Craig Topper
908e685102
Mark FMA4 instructions as commutable and add them to the folding tables.
...
llvm-svn: 163035
2012-08-31 23:10:34 +00:00
Craig Topper
c0387f6b23
Mark FMA3 instructions as commutable so that the operands to the multiply part can be commuted.
...
llvm-svn: 163001
2012-08-31 16:31:13 +00:00
Craig Topper
c30fdbc46c
Add support for converting llvm.fma to fma4 instructions.
...
llvm-svn: 162999
2012-08-31 15:40:30 +00:00
Craig Topper
a999c66292
Convert FMA4 patterns to use target specific nodes instead of intrinsics to align with FMA3.
...
llvm-svn: 162829
2012-08-29 07:18:25 +00:00
Jakob Stoklund Olesen
8ff666fcb6
Remove more mayLoad workarounds.
...
llvm-svn: 162556
2012-08-24 14:43:22 +00:00
Craig Topper
663d160adb
Custom lower FMA intrinsics to target specific nodes and remove the patterns.
...
llvm-svn: 162534
2012-08-24 04:03:22 +00:00
Craig Topper
384fae2f0d
Cleanup the scalar FMA3 definitions. Add patterns to fold loads with scalar forms.
...
llvm-svn: 162260
2012-08-21 07:11:11 +00:00
Craig Topper
4f3879dfa7
Merge FMA3 instructions with and without patterns into single classes using null_frag.
...
llvm-svn: 162257
2012-08-21 05:56:45 +00:00
Craig Topper
b58eec4eaf
Remove FMA3 intrinsic instructions in favor of patterns.
...
llvm-svn: 162194
2012-08-20 06:21:25 +00:00
Craig Topper
37eca54912
Use correct intrinsic for 256-bit VFMSUBADDPS.
...
llvm-svn: 162193
2012-08-20 06:03:04 +00:00
Craig Topper
5122e9f194
Remove trailing white space and tab characters. No functional change.
...
llvm-svn: 162192
2012-08-19 23:37:46 +00:00
Elena Demikhovsky
3cb3b0045c
Added FMA functionality to X86 target.
...
llvm-svn: 161110
2012-08-01 12:06:00 +00:00
Craig Topper
c6ac4cefcc
Add intrinsic forms for FMA instructions to opcode folding tables.
...
llvm-svn: 157917
2012-06-04 07:46:16 +00:00
Craig Topper
fd53b80219
Rename fma4 intrinsics to just fma since they are now used for both FMA4 and FMA3. Autoupgrade support coming in a separate commit.
...
llvm-svn: 157898
2012-06-03 07:26:46 +00:00
Craig Topper
29eafea292
Use sse_load_f32/64 for scalar FMA3 intrinsic patterns instead of 128-bit loads to match instruction behavior.
...
llvm-svn: 157895
2012-06-03 01:40:43 +00:00
Craig Topper
badd755a0e
Add neverHasSideEffects and mayLoad to FMA3 instructions.
...
llvm-svn: 157894
2012-06-03 00:30:49 +00:00
Craig Topper
00649d5111
Remove fadd(fmul) patterns for FMA3. This needs to be implemented by paying attention to FP_CONTRACT and matching @llvm.fma which is not available yet. This will allow us to enablle intrinsic use at least though.
...
llvm-svn: 157804
2012-06-01 06:07:48 +00:00
Craig Topper
df09da8355
Tidy up. Remove trailing spaces and fix the worst of the 80 column violations.
...
llvm-svn: 157799
2012-06-01 05:24:29 +00:00
Elena Demikhovsky
602f3a26d6
Added FMA3 Intel instructions.
...
I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks.
I added tests for GodeGen and intrinsics.
I did not change llvm.fma.f32/64 - it may be done later.
llvm-svn: 157737
2012-05-31 09:20:20 +00:00
Jia Liu
e1d619691b
some comment fix for X86 and ARM
...
llvm-svn: 150902
2012-02-19 02:03:36 +00:00
Jia Liu
b22310fda6
Emacs-tag and some comment fix for all ARM, CellSPU, Hexagon, MBlaze, MSP430, PPC, PTX, Sparc, X86, XCore.
...
llvm-svn: 150878
2012-02-18 12:03:15 +00:00
Craig Topper
29b0737452
Mark scalar FMA4 instructions as ignoring the VEX.L bit.
...
llvm-svn: 147602
2012-01-05 08:56:10 +00:00
Craig Topper
03a0beda88
Add FMA4 instructions to disassembler.
...
llvm-svn: 147367
2011-12-30 05:20:36 +00:00
Craig Topper
cd93de93fa
Separate the concept of having memory access in operand 4 from the concept of having the W bit set for XOP instructons. Removes ORing W-bits in the encoder and will similarly simplify the disassembler implementation.
...
llvm-svn: 147366
2011-12-30 04:48:54 +00:00
Craig Topper
c0f9bcb5d5
Combine FMA4 SS/SD patterns with the instruction definitions.
...
llvm-svn: 147365
2011-12-30 03:33:59 +00:00
Craig Topper
51fe43fcd9
Combine FMA4 PS/PD patterns with the instruction definitions.
...
llvm-svn: 147364
2011-12-30 03:17:15 +00:00
Craig Topper
6c08930c5e
Change FMA4 memory forms to use memopv* instead of alignedloadv*. No need to force alignment on these instructions. Add a couple testcases for memory forms.
...
llvm-svn: 147361
2011-12-30 02:18:36 +00:00
Craig Topper
2ca79b9d4b
Fix load size for FMA4 SS/SD instructions. They need to use f32 and f64 size, but with the special handling to be compatible with the intrinsic expecting a vector. Similar handling is already used elsewhere.
...
llvm-svn: 147360
2011-12-30 01:49:53 +00:00
Craig Topper
d773607eee
Fix execution domains for PS/PD FMA3 instructions. Add SS/SD forms o FMA3 instructions.
...
llvm-svn: 147353
2011-12-29 20:43:40 +00:00
Craig Topper
8cab06a214
Expose FMA3 instructions to the disassembler.
...
llvm-svn: 147351
2011-12-29 20:03:14 +00:00
Jan Sjödin
d19760a40c
Src2 and src3 were accidentally swapped for the FMA4 rr patterns. Undo this and fix the encoding.
...
llvm-svn: 146151
2011-12-08 14:43:19 +00:00
Jan Sjödin
9430e284a9
Support for encoding all FMA4 instructions and tablegen patterns for all
...
remaining FMA4 instructions and intrinsics with tests.
llvm-svn: 145525
2011-11-30 22:09:42 +00:00
Bruno Cardoso Lopes
0f9a1f5e6c
This patch contains support for encoding FMA4 instructions and
...
tablegen patterns for scalar FMA4 operations and intrinsic. Also
add tests for vfmaddsd.
Patch by Jan Sjodin
llvm-svn: 145133
2011-11-25 19:33:42 +00:00
Bruno Cardoso Lopes
acd9230b1b
Add complete assembler support for FMA3 instructions, with descriptions and encodings taken from the AVX manual
...
llvm-svn: 109204
2010-07-23 00:54:35 +00:00