llvm-project

Commit Graph

Author	SHA1	Message	Date
Haojian Wu	eab33cecf3	Fix -Wunused-but-set-variable warning. Summary: A follow-up fix on r279958. Reviewers: bkramer Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23989 llvm-svn: 279964	2016-08-29 12:26:33 +00:00
Igor Breger	1a388871b9	[AVX512] In some cases KORTEST instruction may be used instead of ZEXT + TEST sequence. Differential Revision: http://reviews.llvm.org/D23490 llvm-svn: 279960	2016-08-29 08:52:52 +00:00
Craig Topper	713085e60a	[X86] Don't lower FABS/FNEG masking directly to a ConstantPool load. Just create a ConstantFPSDNode and let that be lowered. This allows broadcast loads to be used when available. llvm-svn: 279958	2016-08-29 04:49:31 +00:00
Craig Topper	f0e822ff31	[AVX-512] Always use v8i64 when converting 512-bit FAND/FOR/FXOR/FANDN to integer operations when DQI isn't supported. This is consistent with the recent changes to promote logical operations to i64 vectors. llvm-svn: 279957	2016-08-29 04:49:27 +00:00
Craig Topper	850feaf3b7	[AVX-512] Add support for selecting 512-bit VPABSB/VPABSW when BWI is available. llvm-svn: 279951	2016-08-28 22:20:51 +00:00
Craig Topper	056c9062f3	[AVX-512] Add patterns for selecting 128/256-bit EVEX VPABS instructions. llvm-svn: 279950	2016-08-28 22:20:48 +00:00
Simon Pilgrim	5369cd9e9c	[X86][AVX512] Only combine EVEX targets shuffles to shuffles of the same number of vector elements Over eager combing prevents the correct folding of writemasks. At the moment this occurs for ALL EVEX shuffles, in the future we need to check that the user of the root shuffle is a VSELECT that can fold to a writemask. llvm-svn: 279934	2016-08-28 17:27:14 +00:00
Craig Topper	abe80cc04d	[AVX-512] Promote AND/OR/XOR to v2i64/v4i64/v8i64 even when we have AVX512F/AVX512VL. Previously we weren't creating masked logical operations if bitcasts appeared between the logic operation and the select. The IR optimizers can move bitcasts across logic operations and create these cases. To minimize the number of cases we need to handle, this change promotes all logic ops to an i64 vector type just like when only SSE or AVX is available. Unfortunately, this also has the consequence of making it difficult to select unmasked VPANDD/VPORD/VPXORD in all the cases it was previously used. This is the cause of most of the test change. This shouldn't result in any functional change though. llvm-svn: 279929	2016-08-28 06:06:28 +00:00
Craig Topper	8877a026e4	[X86] Rename PABSB/D/W instructions to be consistent with SSE/AVX instructions instead of ending 128/256. NFC llvm-svn: 279927	2016-08-28 06:06:21 +00:00
Craig Topper	6943aa306e	[X86] Rename predicate function that detects if requires one of the REX.B, REX.X or REX.R bits. Its old name conflicted with a function in X86II namespace that doesn't quite do the same thing. NFC llvm-svn: 279924	2016-08-27 17:13:43 +00:00
Craig Topper	45793a1f7a	[X86] Keep looping over operands looking for byte registers even if we already found a register that requires a REX prefix. Otherwise we don't error if a high byte register is used after SPL/BPL/DIL/SIL. llvm-svn: 279923	2016-08-27 17:13:41 +00:00
Craig Topper	6acca80e17	[X86] Include XMM/YMM/ZMM16-23 in X86II::isX86_64ExtendedReg. This feels more consistent with its name and simplifies assembler code. llvm-svn: 279922	2016-08-27 17:13:37 +00:00
Craig Topper	06c60c067f	[X86] Don't allow DR8-DR15 to be assembled in 32-bit mode. Add missing test for CR8-CR15. llvm-svn: 279921	2016-08-27 17:13:34 +00:00
Craig Topper	ed71f04abb	[X86] Remove stale comment about FixupBWInsts pass being off by default. NFC llvm-svn: 279915	2016-08-27 05:26:54 +00:00
Craig Topper	225da2cb84	[AVX-512] Allow EVEX encoding unordered/ordered/equal/notequal VCMPPS/PD/SS/SD to be commuted just like the SSE and AVX counterparts. llvm-svn: 279914	2016-08-27 05:22:15 +00:00
Craig Topper	144fdef66b	[X86] Enable FR32/FR64 cmpeq/cmpne/cmpunord/cmpord to be commuted. llvm-svn: 279913	2016-08-27 05:22:12 +00:00
Craig Topper	4891c724aa	[AVX-512] Add load folding for EVEX vcmpps/pd/ss/sd. llvm-svn: 279912	2016-08-27 05:22:08 +00:00
Simon Pilgrim	091c4c781c	[X86][SSE4A] The EXTRQ/INSERTQ bit extraction/insertion ops should be in the integer domain llvm-svn: 279811	2016-08-26 09:55:41 +00:00
Craig Topper	8f27f51192	[X86][SSE] Add CMPSS/CMPSD intrinsic scalar load folding support. llvm-svn: 279806	2016-08-26 07:08:00 +00:00
Michael Kuperstein	2ee911e985	Revert r274613 because it breaks the test suite with AVX512 This reverts most of r274613 (AKA r274626) and its follow-ups (r276347, r277289), due to miscompiles in the test suite. The FastISel change was left in, because it apparently fixes an unrelated issue. (Recommit of r279782 which was broken due to a bad merge.) This fixes 4 out of the 5 test failures in PR29112. llvm-svn: 279788	2016-08-25 22:48:11 +00:00
Michael Kuperstein	6e271f4ce8	Revert r279782 due to debug buildbot breakage. llvm-svn: 279785	2016-08-25 22:14:45 +00:00
Michael Kuperstein	a6ccc8d365	Revert r274613 because it breaks the test suite with AVX512 This reverts most of r274613 and its follow-ups (r276347, r277289), due to miscompiles in the test suite. The FastISel change was left in, because it apparently fixes an unrelated issue. This fixes 4 out of the 5 test failures in PR29112. llvm-svn: 279782	2016-08-25 21:55:41 +00:00
Michael Kuperstein	40887c5566	[X86] 512-bit VPAVG requires AVX512BW Fix VPAVG detection to require AVX512BW, not AVX512F for 512-bit widths, and change associated asserts to assert in the right direction... This fixes PR29111. llvm-svn: 279755	2016-08-25 17:17:46 +00:00
Simon Pilgrim	5aa9c203ac	[X86][SSE] INSERTPS is only combined on v4f32 types. NFCI. llvm-svn: 279751	2016-08-25 17:02:00 +00:00
Simon Pilgrim	6fe4a9ed1e	Fix line endings llvm-svn: 279745	2016-08-25 15:45:27 +00:00
Simon Pilgrim	0ad9f3e93b	[X86][AVX] Provide SubVectorBroadcast fallback if load fold fails (PR29133) Fix for PR29133, matching the approach that was taken for AVX1 scalar broadcasts. llvm-svn: 279735	2016-08-25 12:45:16 +00:00
Craig Topper	5ef7a0f45a	[X86] Simplify getOperandBias as a bit. NFC There's no reason for it to return a signed type. Just return the operand bias in each if instead of starting from 0 and adding in the 'if'. llvm-svn: 279720	2016-08-25 04:16:10 +00:00
Craig Topper	969e56a2cc	[X86] Fix indentation per coding standards. NFC llvm-svn: 279719	2016-08-25 04:16:08 +00:00
Matthias Braun	1eb473680a	MachineFunctionProperties/MIRParser: Rename AllVRegsAllocated->NoVRegs, compute it Rename AllVRegsAllocated to NoVRegs. This avoids the connotation of running after register and simply describes that no vregs are used in a machine function. With that we can simply compute the property and do not need to dump/parse it in .mir files. Differential Revision: http://reviews.llvm.org/D23850 llvm-svn: 279698	2016-08-25 01:27:13 +00:00
Simon Pilgrim	e14653e17d	[X86][SSE] Add MINSD/MAXSD/MINSS/MAXSS intrinsic scalar load folding support These are no different in load behaviour to the existing ADD/SUB/MUL/DIV scalar ops but were missing from isNonFoldablePartialRegisterLoad llvm-svn: 279652	2016-08-24 18:40:53 +00:00
Simon Pilgrim	941bd6bbae	[X86][SSE] Add support for combining VZEXT_MOVL target shuffles Includes adding more general support for the pattern: VZEXT_MOVL(VZEXT_LOAD(ptr)) -> VZEXT_LOAD(ptr) This has unearthed a couple of latent poor codegen issues (MINSS/MAXSS scalar load folding and MOVDDUP/BROADCAST load folding patterns), which will be fixed shortly. Its also reduced a couple of tests so that they no longer reach the instruction threshold necessary to be combined to PSHUFB (see PR26183). llvm-svn: 279646	2016-08-24 18:07:53 +00:00
Simon Pilgrim	7a50c8c2ba	[X86][AVX2] Ensure on 32-bit targets that we broadcast f64 types not i64 (PR29101) llvm-svn: 279622	2016-08-24 12:42:31 +00:00
Simon Pilgrim	6392b8d4ce	[X86][SSE] Add support for 32-bit element vectors to X86ISD::VZEXT_LOAD Consecutive load matching (EltsFromConsecutiveLoads) currently uses VZEXT_LOAD (load scalar into lowest element and zero uppers) for vXi64 / vXf64 vectors only. For vXi32 / vXf32 vectors it instead creates a scalar load, SCALAR_TO_VECTOR and finally VZEXT_MOVL (zero upper vector elements), relying on tablegen patterns to match this into an equivalent of VZEXT_LOAD. This patch adds the VZEXT_LOAD patterns for vXi32 / vXf32 vectors directly and updates EltsFromConsecutiveLoads to use this. This has proven necessary to allow us to easily make VZEXT_MOVL a full member of the target shuffle set - without this change the call to combineShuffle (which is the main caller of EltsFromConsecutiveLoads) tended to recursively recreate VZEXT_MOVL nodes...... Differential Revision: https://reviews.llvm.org/D23673 llvm-svn: 279619	2016-08-24 10:46:40 +00:00
Philip Reames	e83c4b30ca	[stackmaps] More extraction of common code [NFCI] General cleanup before starting to work on the part I want to actually change. llvm-svn: 279586	2016-08-23 23:33:29 +00:00
Simon Pilgrim	c8ad5c069c	[X86][AVX] Don't use SubVectorBroadcast if there are additional users of the chain (PR29088) We could improve on this by making X86SubVBroadcast a full memory intrinsic similar to X86vzload llvm-svn: 279441	2016-08-22 16:47:55 +00:00
Simon Pilgrim	13fa33012b	[X86] Only accept SM_SentinelUndef (-1) as an undefined shuffle mask in range As discussed on D23027 we should be trying to be more strict on what is an undefined mask value. llvm-svn: 279435	2016-08-22 13:18:56 +00:00
Simon Pilgrim	2279e59573	[X86][SSE] Avoid specifying unused arguments in SHUFPD lowering As discussed on PR26491, we are missing the opportunity to make use of the smaller MOVHLPS instruction because we set both arguments of a SHUFPD when using it to lower a single input shuffle. This patch sets the lowered argument to UNDEF if that shuffle element is undefined. This in turn makes it easier for target shuffle combining to decode UNDEF shuffle elements, allowing combines to MOVHLPS to occur. A fix to match against MOVHPD stores was necessary as well. This builds on the improved MOVLHPS/MOVHLPS lowering and memory folding support added in D16956 Adding similar support for SHUFPS will have to wait until have better support for target combining of binary shuffles. Differential Revision: https://reviews.llvm.org/D23027 llvm-svn: 279430	2016-08-22 12:56:54 +00:00
Craig Topper	5f8419da34	[X86] Create a new instruction format to handle 4VOp3 encoding. This saves one bit in TSFlags and simplifies MRMSrcMem/MRMSrcReg format handling. llvm-svn: 279424	2016-08-22 07:38:50 +00:00
Craig Topper	9b20fece81	[X86] Create a new instruction format to handle MemOp4 encoding. This saves one bit in TSFlags and simplifies MRMSrcMem/MRMSrcReg format handling. llvm-svn: 279423	2016-08-22 07:38:45 +00:00
Craig Topper	61b62e56b7	[X86] Space out the encodings of X86 instruction formats. I plan to add some new encodings in future commits and this will reduce the size of those commits. NFC This tries to keep all the ModRM memory and register forms in their own regions of the encodings. Hoping to make it simple on some of the switch statements that operate on these encodings. llvm-svn: 279422	2016-08-22 07:38:41 +00:00
Craig Topper	ca0eda3e6a	[X86] Merge hasVEX_i8ImmReg into the ImmFormat type which had extra unused encodings. This saves one bit in TSFlags. NFC llvm-svn: 279412	2016-08-22 01:37:19 +00:00
Craig Topper	522541231a	[X86] Remove ignoreVEX_L from TSFlags. Only the disassembler needs it and the disassembler doesn't use TSFlags. NFC llvm-svn: 279411	2016-08-22 01:37:16 +00:00
Simon Pilgrim	67e7e22462	[X86][AVX] Dropped combineShuffle256 - this can now be performed by EltsFromConsecutiveLoads llvm-svn: 279397	2016-08-21 15:39:45 +00:00
Guy Blank	9ae797a798	[AVX512][FastISel] Do not use K registers in TEST instructions In some cases, FastIsel was emitting TEST instruction with K reg input, which is illegal. Changed to using KORTEST when dealing with K regs. Differential Revision: https://reviews.llvm.org/D23163 llvm-svn: 279393	2016-08-21 08:02:27 +00:00
Simon Pilgrim	d7a3782ae4	[X86][SSE] Generalised combining to VZEXT_MOVL to any vector size This doesn't change tests codegen as we already combined to blend+zero which is what we lower VZEXT_MOVL to on SSE41+ targets, but it does put us in a better position when we improve shuffling for optsize. llvm-svn: 279273	2016-08-19 17:02:00 +00:00
Simon Pilgrim	f1b8fdc074	[X86][SSE] Add support for matching commuted insertps patterns INSERTPS doesn't fit well with our shuffle mask canonicalization, so we need to attempt both the original mask and the commuted mask to more likely get a match llvm-svn: 279230	2016-08-19 10:31:53 +00:00
Dean Michael Berris	1dd1ca9727	[XRay] Synthesize a reference to the xray_instr_map Without the synthesized reference to a symbol in the xray_instr_map, linker section garbage collection will helpfully remove the whole xray_instr_map section from the final executable (or archive). This will cause the runtime to not be able to identify the sleds and hot-patch the calls/jumps into the runtime trampolines. This change adds a reference from the text section at the end of the function to keep around the associated xray_instr_map section as well. We also make sure that we catch this reference in the test. Reviewers: chandlerc, echristo, majnemer, mehdi_amini Subscribers: mehdi_amini, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D23398 llvm-svn: 279204	2016-08-19 04:44:30 +00:00
Andrew Kaylor	81901d658f	Include X86CallFrameOptimization in the opt-bisect process. Differential Revision: https://reviews.llvm.org/D23683 llvm-svn: 279175	2016-08-18 22:49:51 +00:00
Michael Kuperstein	2bc3d4d46c	[SelectionDAG] Rename fextend -> fpextend, fround -> fpround, frnd -> fround The names of the tablegen defs now match the names of the ISD nodes. This makes the world a slightly saner place, as previously "fround" matched ISD::FP_ROUND and not ISD::FROUND. Differential Revision: https://reviews.llvm.org/D23597 llvm-svn: 279129	2016-08-18 20:08:15 +00:00
Eugene Zelenko	61a72d8850	[LLVM] Fix some Clang-tidy modernize-use-using and Include What You Use warnings Differential revision: https://reviews.llvm.org/D23675 llvm-svn: 279102	2016-08-18 17:56:27 +00:00


13701 Commits