Commit Graph

13667 Commits

Author SHA1 Message Date
Simon Pilgrim c8ad5c069c [X86][AVX] Don't use SubVectorBroadcast if there are additional users of the chain (PR29088)
We could improve on this by making X86SubVBroadcast a full memory intrinsic similar to X86vzload

llvm-svn: 279441
2016-08-22 16:47:55 +00:00
Simon Pilgrim 13fa33012b [X86] Only accept SM_SentinelUndef (-1) as an undefined shuffle mask in range
As discussed on D23027 we should be trying to be more strict on what is an undefined mask value.

llvm-svn: 279435
2016-08-22 13:18:56 +00:00
Simon Pilgrim 2279e59573 [X86][SSE] Avoid specifying unused arguments in SHUFPD lowering
As discussed on PR26491, we are missing the opportunity to make use of the smaller MOVHLPS instruction because we set both arguments of a SHUFPD when using it to lower a single input shuffle.

This patch sets the lowered argument to UNDEF if that shuffle element is undefined. This in turn makes it easier for target shuffle combining to decode UNDEF shuffle elements, allowing combines to MOVHLPS to occur.

A fix to match against MOVHPD stores was necessary as well.

This builds on the improved MOVLHPS/MOVHLPS lowering and memory folding support added in D16956

Adding similar support for SHUFPS will have to wait until have better support for target combining of binary shuffles.

Differential Revision: https://reviews.llvm.org/D23027

llvm-svn: 279430
2016-08-22 12:56:54 +00:00
Craig Topper 5f8419da34 [X86] Create a new instruction format to handle 4VOp3 encoding. This saves one bit in TSFlags and simplifies MRMSrcMem/MRMSrcReg format handling.
llvm-svn: 279424
2016-08-22 07:38:50 +00:00
Craig Topper 9b20fece81 [X86] Create a new instruction format to handle MemOp4 encoding. This saves one bit in TSFlags and simplifies MRMSrcMem/MRMSrcReg format handling.
llvm-svn: 279423
2016-08-22 07:38:45 +00:00
Craig Topper 61b62e56b7 [X86] Space out the encodings of X86 instruction formats. I plan to add some new encodings in future commits and this will reduce the size of those commits. NFC
This tries to keep all the ModRM memory and register forms in their own regions of the encodings. Hoping to make it simple on some of the switch statements that operate on these encodings.

llvm-svn: 279422
2016-08-22 07:38:41 +00:00
Craig Topper ca0eda3e6a [X86] Merge hasVEX_i8ImmReg into the ImmFormat type which had extra unused encodings. This saves one bit in TSFlags. NFC
llvm-svn: 279412
2016-08-22 01:37:19 +00:00
Craig Topper 522541231a [X86] Remove ignoreVEX_L from TSFlags. Only the disassembler needs it and the disassembler doesn't use TSFlags. NFC
llvm-svn: 279411
2016-08-22 01:37:16 +00:00
Simon Pilgrim 67e7e22462 [X86][AVX] Dropped combineShuffle256 - this can now be performed by EltsFromConsecutiveLoads
llvm-svn: 279397
2016-08-21 15:39:45 +00:00
Guy Blank 9ae797a798 [AVX512][FastISel] Do not use K registers in TEST instructions
In some cases, FastIsel was emitting TEST instruction with K reg input, which is illegal.
Changed to using KORTEST when dealing with K regs.

Differential Revision: https://reviews.llvm.org/D23163

llvm-svn: 279393
2016-08-21 08:02:27 +00:00
Simon Pilgrim d7a3782ae4 [X86][SSE] Generalised combining to VZEXT_MOVL to any vector size
This doesn't change tests codegen as we already combined to blend+zero which is what we lower VZEXT_MOVL to on SSE41+ targets, but it does put us in a better position when we improve shuffling for optsize.

llvm-svn: 279273
2016-08-19 17:02:00 +00:00
Simon Pilgrim f1b8fdc074 [X86][SSE] Add support for matching commuted insertps patterns
INSERTPS doesn't fit well with our shuffle mask canonicalization, so we need to attempt both the original mask and the commuted mask to more likely get a match

llvm-svn: 279230
2016-08-19 10:31:53 +00:00
Dean Michael Berris 1dd1ca9727 [XRay] Synthesize a reference to the xray_instr_map
Without the synthesized reference to a symbol in the xray_instr_map,
linker section garbage collection will helpfully remove the whole
xray_instr_map section from the final executable (or archive). This will
cause the runtime to not be able to identify the sleds and hot-patch the
calls/jumps into the runtime trampolines.

This change adds a reference from the text section at the end of the
function to keep around the associated xray_instr_map section as well.

We also make sure that we catch this reference in the test.

Reviewers: chandlerc, echristo, majnemer, mehdi_amini

Subscribers: mehdi_amini, llvm-commits, dberris

Differential Revision: https://reviews.llvm.org/D23398

llvm-svn: 279204
2016-08-19 04:44:30 +00:00
Andrew Kaylor 81901d658f Include X86CallFrameOptimization in the opt-bisect process.
Differential Revision: https://reviews.llvm.org/D23683

llvm-svn: 279175
2016-08-18 22:49:51 +00:00
Michael Kuperstein 2bc3d4d46c [SelectionDAG] Rename fextend -> fpextend, fround -> fpround, frnd -> fround
The names of the tablegen defs now match the names of the ISD nodes.
This makes the world a slightly saner place, as previously "fround" matched
ISD::FP_ROUND and not ISD::FROUND.

Differential Revision: https://reviews.llvm.org/D23597

llvm-svn: 279129
2016-08-18 20:08:15 +00:00
Eugene Zelenko 61a72d8850 [LLVM] Fix some Clang-tidy modernize-use-using and Include What You Use warnings
Differential revision: https://reviews.llvm.org/D23675

llvm-svn: 279102
2016-08-18 17:56:27 +00:00
Sanjay Patel 7d37b221a2 fix typo; NFC
llvm-svn: 279068
2016-08-18 14:17:34 +00:00
Simon Pilgrim 916485c765 Remove trailing whitespace
llvm-svn: 279054
2016-08-18 11:22:22 +00:00
Justin Bogner cd1d5aaf2e Replace a few more "fall through" comments with LLVM_FALLTHROUGH
Follow up to r278902. I had missed "fall through", with a space.

llvm-svn: 278970
2016-08-17 20:30:52 +00:00
Justin Bogner b03fd12cef Replace "fallthrough" comments with LLVM_FALLTHROUGH
This is a mechanical change of comments in switches like fallthrough,
fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead.

llvm-svn: 278902
2016-08-17 05:10:15 +00:00
Sanjay Patel 904cd39b05 [x86] Allow merging multiple instances of an immediate within a basic block for code size savings, for 64-bit constants.
This patch handles 64-bit constants which can be encoded as 32-bit immediates.

It extends the functionality added by https://reviews.llvm.org/D11363 for 32-bit constants to 64-bit constants.

Patch by Sunita Marathe!

Differential Revision: https://reviews.llvm.org/D23391

llvm-svn: 278857
2016-08-16 21:35:16 +00:00
Simon Pilgrim cc316f013a [X86][SSE] Add support for combining v2f64 target shuffles to VZEXT_MOVL byte rotations
The combine was only matching v2i64 as it assumed lowering to MOVQ - but we have v2f64 patterns that match in a similar fashion

llvm-svn: 278794
2016-08-16 12:52:06 +00:00
Simon Pilgrim f16cd361d4 [X86][SSE] Add support for combining target shuffles to PALIGNR byte rotations
llvm-svn: 278787
2016-08-16 10:03:23 +00:00
Guy Blank 722caebdae [X86] Add xgetbv/xsetbv intrinsics to non-windows platforms
Differential Revision: https://reviews.llvm.org/D21958

llvm-svn: 278782
2016-08-16 06:41:00 +00:00
Craig Topper f774de6d54 [X86] PADDUSB/W instructions should be commutable.
llvm-svn: 278654
2016-08-15 06:31:57 +00:00
Craig Topper 80c8b80919 [X86] Mark some of the X86 SDNodes as commutative.
llvm-svn: 278653
2016-08-15 04:47:30 +00:00
Craig Topper dbc387cfc9 [X86] X86ISD::FANDN is not commutative or associative.
llvm-svn: 278652
2016-08-15 04:47:28 +00:00
Craig Topper 37e8c5443c [AVX-512] Mark VPMADDWD as commutable to match SSE/AVX version.
llvm-svn: 278629
2016-08-14 17:57:22 +00:00
Craig Topper c677e97dff [AVX-512] Add masked commutable floating point max/min instructions to folding tables.
llvm-svn: 278628
2016-08-14 17:57:19 +00:00
Craig Topper 29fbdc309a [AVX-512] Add masked logical operations to memory folding tables.
llvm-svn: 278627
2016-08-14 17:57:16 +00:00
Igor Breger 505f2cc468 [AVX512] Fix VFPCLASSSD/VFPCLASSSS intrinsic lowering. The i1 result should be zero extended according to SPEC.
Differential Revision: http://reviews.llvm.org/D23489

llvm-svn: 278626
2016-08-14 13:58:57 +00:00
Igor Breger 8672408db0 [AVX512] Fix insertelement i1 lowering.
1. Use shuffle to insert element i1 into vector. The previous implementation was incorrect ( dest_bit OR src_bit , it doesn't clear the bit if src_bit=0 )
2. Improve shuffle i1 vector, use CVT2MASK if supported instead TRUNCATE.

Differential Revision: http://reviews.llvm.org/D23347

llvm-svn: 278623
2016-08-14 05:25:07 +00:00
Craig Topper 8c372a31b7 [X86] Add a check of isCommutable at the top of X86InstrInfo::findCommutedOpIndices. Most callers don't check if the instruction is commutable before calling.
This saves us the trouble of ending up in the default of the switch and having to determine if this is an FMA or not.

llvm-svn: 278597
2016-08-13 06:48:44 +00:00
Craig Topper eafdbecc44 [AVX-512] Add isCommutable to scalar FMA3 instructions.
llvm-svn: 278596
2016-08-13 06:48:41 +00:00
Craig Topper 5f2441d8f3 [AVX-512] Add commutable flags to 132 form FMA3 instructions.
llvm-svn: 278595
2016-08-13 06:48:39 +00:00
Craig Topper e5115aa4ca [X86] Remove patterns for (vzmovl (insert_subvector undef, (scalar_to_vector))) as the (vzmovl VR256) pattern has higher priority. NFC
llvm-svn: 278594
2016-08-13 06:02:19 +00:00
Craig Topper 3f8126e6fa [AVX-512] Remove an AddedComplexity that was prioritizing basic vzmovl patterns over more complex ones that produce better code.
llvm-svn: 278593
2016-08-13 05:43:20 +00:00
Craig Topper 600685d510 [AVX-512] Add patterns to support VZEXT_MOVL from 512-bit vectors with 64-bit and 32-bit elements.
Fixes PR28961.

llvm-svn: 278592
2016-08-13 05:33:12 +00:00
Hans Wennborg 0dd9ed1d45 Fix more dereferenced end() iterators after r278532
llvm-svn: 278587
2016-08-13 01:12:49 +00:00
Hans Wennborg 2d87ccfd58 X86: Fix another dereferenced end() iterator after r278532
llvm-svn: 278577
2016-08-12 23:35:59 +00:00
Duncan P. N. Exon Smith 69b0650548 X86: Stop dereferencing end() in X86FrameLowering::emitEpilogue
On a Windows build of Chromium, r278532 (up to r278539)
X86FrameLowering::emitEpilogue because it wasn't wary enough of the
return of MachineBasicBlock::getFirstTerminator.  Guard all the uses
here.

Note that r278532 *looks* like an NFC commit (just an API change), but
it removes a couple of layers of abstraction and is probably causing
optimization differences in MSVC.

llvm-svn: 278572
2016-08-12 22:43:33 +00:00
Artur Pilipenko 87e4038a91 [x86] X86ISelLowering zext(add_nuw(x, C)) --> add(zext(x), C_zext)
Currently X86ISelLowering has a similar transformation for sexts:
sext(add_nsw(x, C)) --> add(sext(x), C_sext)

In this change I extend this code to handle zexts as well.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D23359

llvm-svn: 278520
2016-08-12 16:08:30 +00:00
Simon Pilgrim 687d71e877 [X86][SSE] Add support for combining target shuffles to PSLLDQ/PSRLDQ byte shifts
llvm-svn: 278502
2016-08-12 11:24:34 +00:00
Simon Pilgrim ed96b9adfb [X86][SSE] Fixed PALIGNR target shuffle decode
The PALIGNR target shuffle decode was not taking into account that DecodePALIGNRMask (rather oddly) expects the operands to be in reverse order, nor was it detecting unary patterns, causing combines to combine with the incorrect input.

The cgbuiltin, auto upgrade and instruction comments code correctly swap the operands so are not affected.

llvm-svn: 278494
2016-08-12 10:10:51 +00:00
David Majnemer 42531260b3 Use the range variant of find/find_if instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278469
2016-08-12 03:55:06 +00:00
David Majnemer 562e82945e Use the range variant of find_if instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278443
2016-08-12 00:18:03 +00:00
David Majnemer 0d955d0bf5 Use the range variant of find instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

llvm-svn: 278433
2016-08-11 22:21:41 +00:00
Vyacheslav Klochkov 6daefcf626 X86-FMA3: Implemented commute transformation for EVEX/AVX512 FMA3 opcodes.
This helped to improved memory-folding and register coalescing optimizations.

Also, this patch fixed the tracker #17229.

Reviewer: Craig Topper.
Differential Revision: https://reviews.llvm.org/D23108

llvm-svn: 278431
2016-08-11 22:07:33 +00:00
David Majnemer 0a16c22846 Use range algorithms instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278417
2016-08-11 21:15:00 +00:00
Duncan P. N. Exon Smith 62e351f5a4 X86: Use operator lookup for operator==, NFC
Avoid relying on the MachineInstrBundleIterator operator== being
implemented as a member function.

llvm-svn: 278347
2016-08-11 15:51:29 +00:00