llvm-project

Commit Graph

Author	SHA1	Message	Date
Marcin Koscielnicki	0275fac2c9	[X86] Extend some Linux special cases to cover kFreeBSD. Both Linux and kFreeBSD use glibc, so follow similar code paths. Add isTargetGlibc to check for this, and use it instead of isTargetLinux in a few places. Fixes PR22248 for kFreeBSD. Differential Revision: http://reviews.llvm.org/D19104 llvm-svn: 268624	2016-05-05 11:35:51 +00:00
David Majnemer	911d0e3c21	[X86] Use the right type when folding xor (truncate (shift)) -> setcc The result type of setcc is dependent on whether or not AVX512 is present. We had an X86-specific DAG-combine which assumed that the result type should be i8 when it could be i1. This meant that we would generate illegal setccs which LowerSETCC did not like. Instead, use an appropriate type and zero extend to i8. Also, there were some scenarios where the fold should have fired but didn't because we were overly cautious about the types. This meant that we generated: shrl $31, %edi andl $1, %edi kmovw %edi, %k0 kxnorw %k0, %k0, %k1 kshiftrw $15, %k1, %k1 kxorw %k1, %k0, %k0 kmovw %k0, %eax instead of: testl %edi, %edi setns %al This fixes PR27638. llvm-svn: 268609	2016-05-05 06:00:56 +00:00
Quentin Colombet	0c5bfd0514	[X86] Add a few register classes for x32 address accesses. The new register classes allow to tell the machine verifier that it is fine to use RIP for address accesses in x32 mode. Prior to that patch, we would complain that we are using a GR64 in place of GR32, whereas it is actually fine to use GR64 for x32 as long as the 32 high bits are 0s. RIP has this property and is used for RIP-relative addressing. This partially fixes http://llvm.org/PR27481. llvm-svn: 268567	2016-05-04 22:45:31 +00:00
Simon Pilgrim	1f5ad702f8	[SelectionDAG] BITREVERSE vector legalization of bit operations (REAPPLIED) Some vector bit operations are promoted instead of having custom lowering. This patch changes the isOperationLegalOrCustom tests for vector AND/OR operations to use a new TLI helper isOperationLegalOrCustomOrPromote instead, allowing the SSE implementations to stay on the simd unit. Differential Revision: http://reviews.llvm.org/D19805 llvm-svn: 268561	2016-05-04 22:08:51 +00:00
Sanjay Patel	13d57b94bb	[x86] add tests to show current codegen for obscured fneg/fabs llvm-svn: 268533	2016-05-04 19:06:03 +00:00
Simon Pilgrim	1a14f0d25c	Revert r268504 llvm-svn: 268526	2016-05-04 17:49:14 +00:00
Simon Pilgrim	bc0e1d7492	[X86][SSE] Regenerate vector bswap tests llvm-svn: 268514	2016-05-04 15:45:48 +00:00
Simon Pilgrim	b97c06210b	[SelectionDAG] BITREVERSE vector legalization of bit operations Vector bit operations are typically promoted instead of having custom lowering. This patch changes the isOperationLegalOrCustom tests for vector AND/OR operations to use isOperationLegalOrPromote instead, allowing the SSE implementations to stay on the simd unit. Differential Revision: http://reviews.llvm.org/D19805 llvm-svn: 268504	2016-05-04 15:01:13 +00:00
Elena Demikhovsky	24aba1ca38	The test files are auto-generated by update_llc_test_checks.py utility. No functional changes. llvm-svn: 268498	2016-05-04 14:31:18 +00:00
David Majnemer	2c5aeabedd	[X86] Lower zext i1 arguments i1 is now a legal type for X86 with AVX512. There were some paths in X86FastISel which were not quite ready to see an i1 value: they were not quite sure how to deal with sign/zero extends for call arguments. DTRT by extending to i8 for zeroext and bailing out of FastISel for signext. This fixes PR27591. llvm-svn: 268470	2016-05-04 00:22:23 +00:00
Simon Pilgrim	fb1766ad68	[X86][XOP] Add placeholder VPERMIL2 combining tests llvm-svn: 268450	2016-05-03 21:55:37 +00:00
Tim Northover	d2ecbccf27	X86-Darwin: start emitting data-region directives for jump-tables. The surrounding tools can cope these days, and they were invented for a reason. llvm-svn: 268437	2016-05-03 21:03:41 +00:00
Quentin Colombet	26dab3a485	[ImplicitNullChecks] Account for implicit-defs as well when updating the liveness. The replaced load may have implicit-defs and those defs may be used in the block of the original load. Make sure to update the liveness accordingly. This is a generalization of r267817. llvm-svn: 268412	2016-05-03 18:09:06 +00:00
Simon Pilgrim	d2752708a3	[X86][SSE] Added target shuffle combine to MOVQ llvm-svn: 268391	2016-05-03 15:05:13 +00:00
Simon Pilgrim	32e78c3ff7	[X86][SSSE3] Missing combine opportunity to simplify to a MOVQ shuffle llvm-svn: 268378	2016-05-03 13:12:44 +00:00
Igor Breger	58c07806ae	[AVX512] Add support for commutative MAX/MIN. In general, the VMAX{PS,PD} and VMIN{PS,PD} instructions are not commutative. The combine pass converts VMAX/VMIN to the commutative nodes VMAXC/VMINC only when UnsafeFPMath is used. Differential Revision: http://reviews.llvm.org/D19860 llvm-svn: 268375	2016-05-03 11:51:45 +00:00
Igor Breger	ab076c683c	[AVX512] Fix lowerV4X128VectorShuffle to select input operands correctly. Differential Revision: http://reviews.llvm.org/D19803 llvm-svn: 268368	2016-05-03 08:08:44 +00:00
Quentin Colombet	776e6de516	[MachineBlockPlacement] Let the target optimize the branches at the end. After the layout of the basic blocks is set, the target may be able to get rid of unconditional branches to fallthrough blocks that the generic code does not catch. This happens any time TargetInstrInfo::AnalyzeBranch is not able to analyze all the branches involved in the terminators sequence, while still understanding a few of them. In such situation, AnalyzeBranch can directly modify the branches if it has been instructed to do so. This patch takes advantage of that. llvm-svn: 268328	2016-05-02 22:58:59 +00:00
Quentin Colombet	4e1d389ac5	[X86] Model FAULTING_LOAD_OP as a terminator and branch. This operation may branch to the handler block and we do not want it to happen anywhere within the basic block. Moreover, by marking it "terminator and branch" the machine verifier does not wrongly assume (because of AnalyzeBranch not knowing better) the branch is analyzable. Indeed, the target was seeing only the unconditional branch and not the faulting load op and thought it was a simple unconditional block. The machine verifier was complaining because of that and moreover, other optimizations could have done wrong transformations! In the process, simplify the representation of the handler block in the faulting load op. Now, we directly reference the handler block instead of using a label. This has the benefits of: 1. MC knows how to issue a label for a BB, so leave that to it. 2. Accessing the target BB from its label is painful, whereas it is direct from a MBB operand. Note: The 2-byte offset in implicit-null-check.ll comes from the fact that the unconditional jumps are not removed anymore, as the whole terminator sequence is not analyzable anymore. Will fix it in a subsequent commit. llvm-svn: 268327	2016-05-02 22:58:54 +00:00
Simon Pilgrim	21b2c5660e	[X86][AVX2] Added 128-bit wide shuffle test Demonstrate missing 128-bit wide shuffle combine support llvm-svn: 268290	2016-05-02 19:46:58 +00:00
David L Kreitzer	0fe4632bd7	Enable the X86 call frame optimization for the 64-bit targets that allow it. Fixes PR27241. Differential Revision: http://reviews.llvm.org/D19688 llvm-svn: 268227	2016-05-02 13:45:25 +00:00
Craig Topper	b6da65403a	[AVX512] VPACKUSWB/VPACKSSWB should not be encoded with EVEX.W=1. While there fix the execution domain for VPACKSSDW/VPACKUSDW. llvm-svn: 268200	2016-05-01 17:38:32 +00:00
Igor Breger	110af565c7	getelementptr instruction, support index vector of EVT. Differential Revision: http://reviews.llvm.org/D19775 llvm-svn: 268195	2016-05-01 13:29:12 +00:00
Igor Breger	131008fbcb	Change AVX512 broadcastsd/ss patterns interaction with spilling. The new implementation takes a scalar register and generates a vector without COPY_TO_REGCLASS (turning it into a VR128 register). The issue is that during register allocation we may spill a scalar value using 128-bit loads and stores, wasting cache bandwidth. Differential Revision: http://reviews.llvm.org/D19579 llvm-svn: 268190	2016-05-01 08:40:00 +00:00
Craig Topper	e430de8be6	[AVX512] Prefer AVX512 VPACK instructions over AVX/AVX2 instructions when VLX and BWI are supported. llvm-svn: 268189	2016-05-01 06:52:19 +00:00
Haicheng Wu	4afe0425db	[MBP] Use Function::optForSize() instead of checking OptimizeForSize directly. Fix a FIXME. Disable loop alignment if compiled with -Oz now. llvm-svn: 268121	2016-04-29 22:01:10 +00:00
Sriraman Tallam	7da9b445ea	Differential Revision: http://reviews.llvm.org/D19733 llvm-svn: 268106	2016-04-29 21:19:16 +00:00
Matt Arsenault	ab2232cf73	DAGCombiner: Reduce truncated shl width llvm-svn: 268094	2016-04-29 19:53:16 +00:00
Mitch Bodart	e60465ddf7	[X86] Enable the post-RA-scheduler for clang's default 32-bit cpu. For compilations with no explicit cpu specified, this exhibits nice gains on Silvermont, with neutral performance on big cores. Differential Revision: http://reviews.llvm.org/D19138 llvm-svn: 267809	2016-04-27 22:52:35 +00:00
Quentin Colombet	bf200688de	[X86][FastISel] Make sure we use the right register class when we select stores. llvm-svn: 267806	2016-04-27 22:33:42 +00:00
Quentin Colombet	d6dbec4c6f	[X86] Fix the lowering of TLS calls. The callseq_end node must be glued with the TLS calls, otherwise, the generic code will miss the uses of the returned value and will mark it dead. Moreover, TLSCall 64-bit pseudo must not set an implicit-use on RDI, the pseudo uses the symbol address at this point not RDI and the lowering will do the right thing. llvm-svn: 267797	2016-04-27 21:37:37 +00:00
Kevin B. Smith	c378a99ba5	[X86]: Quit promoting 16 bit loads to 32 bit. Differential Revision: http://reviews.llvm.org/D19592 llvm-svn: 267773	2016-04-27 19:58:03 +00:00
Nico Weber	e69b9548b8	Revert r267649, it caused PR27539. llvm-svn: 267723	2016-04-27 15:16:54 +00:00
Ahmed Bougacha	19a2ee591a	[X86] Don't assume that MMX extractelts are from index 0. It's probably the case for all 3 MMX users out there, but with hand-crafted IR, you can trigger selection failures. Fix that. llvm-svn: 267652	2016-04-27 01:35:29 +00:00
Ahmed Bougacha	e68363a03c	[X86] Re-enable MMX i32 extractelt combine. This effectively adds back the extractelt combine removed by r262358: the direct case can still occur (because x86_mmx is special, see r262446), but it's the indirect case that's now superseded by the generic combine. llvm-svn: 267651	2016-04-27 01:35:25 +00:00
Cong Hou	6f879d9eb1	Detects the SAD pattern on X86 so that much better code will be emitted once the pattern is matched. Differential revision: http://reviews.llvm.org/D14840 llvm-svn: 267649	2016-04-27 01:29:18 +00:00
Quentin Colombet	4ff3cfb673	[X86] Make sure it is safe to clobber EFLAGS, if need be, when choosing the prologue. Do not use basic blocks that have EFLAGS live-in as prologue if we need to realign the stack. Realigning the stack uses an AND instruction and this clobbers EFLAGS. Another alternative would have been to save and restore EFLAGS around the stack realignment code, but this is likely inefficient. Fixes PR27531. llvm-svn: 267634	2016-04-26 23:44:14 +00:00
Mitch Bodart	807e13379b	[X86] Replace -mcpu with -mattr in several tests Differential Revision: http://reviews.llvm.org/D19568 llvm-svn: 267629	2016-04-26 23:36:38 +00:00
Quentin Colombet	08e79990a0	[MachineBasicBlock] Take advantage of the partially dead information. Thanks to that information, we no longer report a register as live when it is not. llvm-svn: 267622	2016-04-26 23:14:29 +00:00
Quentin Colombet	3f19245015	[MachineInstrBundle] Improve the recognition of dead definitions. Now, it is possible to know that partial definitions are dead definitions and recognize that clobbered registers are also dead. llvm-svn: 267621	2016-04-26 23:14:24 +00:00
Manman Ren	1c3f65a18c	Swift Calling Convention: use %RAX for sret. We don't need to copy the sret argument into %rax upon return. rdar://25671494 llvm-svn: 267579	2016-04-26 18:08:06 +00:00
Sanjay Patel	d66607bd8c	[CodeGenPrepare] use branch weight metadata to decide if a select should be turned into a branch This is part of solving PR27344: https://llvm.org/bugs/show_bug.cgi?id=27344 CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the IR further with a select in place. For the TLI hook default, >99% taken or not taken is chosen as the default threshold for a highly predictable branch. Even the most limited HW branch predictors will be correct on this branch almost all the time, so even a massive mispredict penalty perf loss would be overcome by the win from all the times the branch was predicted correctly. As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's MispredictPenalty. Or we could just let targets override the default by implementing the hook with that and other target-specific options. Note that trying to statically determine mispredict rates for close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. Ie, 50/50 taken/not-taken might still be 100% predictable. Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable() branch weight default values are 4 and 64. A proposal to change that is in D19435. Differential Revision: http://reviews.llvm.org/D19488 llvm-svn: 267572	2016-04-26 17:11:17 +00:00
Andrey Turetskiy	b405606432	[X86] PR27502: Fix the LEA optimization pass. Handle MachineBasicBlock as a memory displacement operand in the LEA optimization pass. Differential Revision: http://reviews.llvm.org/D19409 llvm-svn: 267551	2016-04-26 12:18:12 +00:00
Ahmed Bougacha	5cf735a5b1	[X86] Use LivePhysRegs in X86FixupBWInsts. Kill-flags, which computeRegisterLiveness uses, are not reliable. LivePhysRegs is. Differential Revision: http://reviews.llvm.org/D19472 llvm-svn: 267495	2016-04-26 00:00:48 +00:00
Sanjay Patel	43c7af6889	add tests for potential CGP transform (PR27344) llvm-svn: 267426	2016-04-25 16:56:52 +00:00
Sanjay Patel	0fc4137065	[x86] auto-generate checks for cmov tests llvm-svn: 267417	2016-04-25 15:26:57 +00:00
David Majnemer	dd21523653	[WinEH] Update SplitAnalysis::computeLastSplitPoint to cope with multiple EH successors We didn't have logic to correctly handle CFGs where there was more than one EH-pad successor (these are novel with WinEH). There were situations where a register was live in one exceptional successor but not another but the code as written would only consider the first exceptional successor it found. This resulted in split points which were insufficiently early if an invoke was present. This fixes PR27501. N.B. This removes getLandingPadSuccessor. llvm-svn: 267412	2016-04-25 14:31:32 +00:00
Michael Zuckerman	1bd66dd1c2	Fixing wrong mask size error. From __mmask8 to __mmask16. Was reviewed over the shoulder by AsafBadouh. Connected to review http://reviews.llvm.org/D19195. llvm-svn: 267379	2016-04-25 05:27:51 +00:00
Craig Topper	61a14911b2	[X86] Add a complete set of tests for all operand sizes of cttz/ctlz with and without zero undef being lowered to bsf/bsr. llvm-svn: 267373	2016-04-25 01:01:15 +00:00
Simon Pilgrim	646c2a5569	[X86][AVX] Added PR24935 test case llvm-svn: 267362	2016-04-24 20:30:48 +00:00

7376 Commits