llvm-project

Commit Graph

Author	SHA1	Message	Date
Elena Demikhovsky	d2cb3c8876	AVX-512: Fixed the "test" operation for i1 type Using KORTESTW for comparison i1 value with zero was wrong since the instruction tests 16 bits. KORTESTW may be used with KSHIFTL+KSHIFTR that clean the 15 upper bits. I removed (X86cmp i1, 0) pattern and zero-extend i1 to i8 and then use TESTB. There are some cases where i1 is in the mask register and the upper bits are already zeroed. Then KORTESTW is the better solution, but it is subject for optimization. Meanwhile, I'm fixing the correctness issue. llvm-svn: 228916	2015-02-12 08:40:34 +00:00
Michael Kuperstein	db95d04be4	[X86] A heuristic to estimate the size impact for converting stack-relative parameter movs to pushes This gives a rough estimate of whether using pushes instead of movs is profitable, in terms of size. We go over all calls in the MachineFunction and compute: a) For each callsite that can not use pushes, the penalty of not having a reserved call frame. b) For each callsite that can use pushes, the gain of actually replacing the movs with pushes (and the potential penalty of having to readjust the stack). Differential Revision: http://reviews.llvm.org/D7561 llvm-svn: 228915	2015-02-12 08:36:35 +00:00
Ahmed Bougacha	24433a7005	[CodeGen] Don't blindly combine (fp_round (fp_round x)) to (fp_round x). We used to do this DAG combine, but it's not always correct: If the first fp_round isn't a value preserving truncation, it might introduce a tie in the second fp_round, that wouldn't occur in the single-step fp_round we want to fold to. In other words, double rounding isn't the same as rounding. Differential Revision: http://reviews.llvm.org/D7571 llvm-svn: 228911	2015-02-12 06:15:29 +00:00
Simon Pilgrim	2a9a745328	[X86][SSE] Added dual vector truncation tests. llvm-svn: 228857	2015-02-11 18:14:35 +00:00
Sanjay Patel	afe251649b	fixed to test features, not CPUs llvm-svn: 228836	2015-02-11 15:00:41 +00:00
Sanjay Patel	b53d82cbc5	fixed to test features, not CPUs llvm-svn: 228835	2015-02-11 15:00:19 +00:00
Sanjay Patel	8b88bc91bd	fixed to test features, not CPUs llvm-svn: 228834	2015-02-11 14:58:25 +00:00
David Majnemer	ca19485f08	X86: @llvm.frameaddress should defer to SelectionDAG for Win CFI llvm-svn: 228754	2015-02-10 22:00:34 +00:00
David Majnemer	13d0b11d7b	X86: Make @llvm.frameaddress work correctly with Windows unwind codes Simply loading or storing the frame pointer is not sufficient for Windows targets. Instead, create a synthetic frame object that we will lower later. References to this synthetic object will be replaced with the correct reference to the frame address. llvm-svn: 228748	2015-02-10 21:22:05 +00:00
David Majnemer	a7d908eb2b	X86: Emit Win64 SaveXMM opcodes at the right offset in the right order Walk the instructions marked FrameSetup and consider any stores of XMM registers to the stack as needing a SaveXMM opcode. This fixes PR22521. Differential Revision: http://reviews.llvm.org/D7527 llvm-svn: 228724	2015-02-10 19:01:47 +00:00
Paul Robinson	848cf6aa3a	Explicitly initialize a flag in a default constructor. Works around a Visual C++ issue. Patch by Douglas Yung! llvm-svn: 228699	2015-02-10 15:30:02 +00:00
Simon Pilgrim	d142ab7d08	[X86][AVX2] Missing AVX2 memory folding instructions Added most of the missing vector folding patterns for AVX2 (as well as fixing the vpermpd and vpermq patterns) Differential Revision: http://reviews.llvm.org/D7492 llvm-svn: 228688	2015-02-10 13:22:57 +00:00
Simon Pilgrim	cd32254a35	[X86][XOP] Added XOP memory folding patterns + tests This patch adds the complete AMD Bulldozer XOP instruction set to the memory folding pattern tables for stack folding, etc. Note: Many of the XOP instructions have multiple table entries as it can fold loads from different sources. Differential Revision: http://reviews.llvm.org/D7484 llvm-svn: 228685	2015-02-10 12:57:17 +00:00
Andrea Di Biagio	62622d2396	[X86][FastIsel] Avoid introducing legacy SSE instructions if the target has AVX. This patch teaches X86FastISel how to select AVX instructions for scalar float/double convert operations. Before this patch, X86FastISel always selected legacy SSE instructions for FPExt (from float to double) and FPTrunc (from double to float). For example: \code define double @foo(float %f) { %conv = fpext float %f to double ret double %conv } \end code Before (with -mattr=+avx -fast-isel) X86FastIsel selected a CVTSS2SDrr which is legacy SSE: cvtss2sd %xmm0, %xmm0 With this patch, X86FastIsel selects a VCVTSS2SDrr instead: vcvtss2sd %xmm0, %xmm0, %xmm0 Added test fast-isel-fptrunc-fpext.ll to check both the register-register and the register-memory float/double conversion variants. Differential Revision: http://reviews.llvm.org/D7438 llvm-svn: 228682	2015-02-10 12:04:41 +00:00
Nick Lewycky	1cbc13a928	Remove non-test files that appear to have been accidentally committed in r228641. llvm-svn: 228657	2015-02-10 02:39:17 +00:00
Chandler Carruth	b65d61a2e8	[x86] Fix PR22524: the DAG combiner was incorrectly handling illegal nodes when folding bitcasts of constants. We can't fold things and then check after-the-fact whether it was legal. Once we have formed the DAG node, arbitrary other nodes may have been collapsed to it. There is no easy way to go back. Instead, we need to test for the specific folding cases we're interested in and ensure those are legal first. This could in theory make this less powerful for bitcasting from an integer to some vector type, but AFAICT, that can't actually happen in the SDAG so it's fine. Now, we only whitelist specific int->fp and fp->int bitcasts for post-legalization folding. I've added the test case from the PR. (Also as a note, this does not appear to be in 3.6, no backport needed) llvm-svn: 228656	2015-02-10 02:25:56 +00:00
David Majnemer	93c22a45be	X86: Emit an ABI compliant prologue and epilogue for Win64 Win64 has specific constraints on what valid prologues and epilogues look like. This constraint is born from the flexibility and descriptiveness of Win64's unwind opcodes. Prologues previously emitted by LLVM could not be represented by the unwind opcodes, preventing operations powered by stack unwinding from working successfully. Differential Revision: http://reviews.llvm.org/D7520 llvm-svn: 228641	2015-02-10 00:57:42 +00:00
Sanjay Patel	546f26acf3	fixed to test features, not CPUs llvm-svn: 228581	2015-02-09 17:17:09 +00:00
Sanjay Patel	baf0a2415c	fix test attributes; this is an SSE2 test, not a Nehalem test llvm-svn: 228546	2015-02-08 21:14:27 +00:00
Sanjay Patel	e6eed52325	fix test attributes; this is an x86-64 test, not a Nehalem test llvm-svn: 228545	2015-02-08 21:10:40 +00:00
Sanjay Patel	5cf03374a1	fix test attributes; these are SSE2 tests, not Nehalem tests llvm-svn: 228544	2015-02-08 21:05:03 +00:00
Sanjay Patel	9be09a3617	fix test attributes; these are SSE2 tests, not Nehalem tests llvm-svn: 228541	2015-02-08 20:50:58 +00:00
Sanjay Patel	ff9dec22ba	fix test attributes; these are x86-64 tests, not Nehalem tests llvm-svn: 228536	2015-02-08 20:05:53 +00:00
Sanjay Patel	972425d221	fix test attributes; these are MMX tests, not Nehalem tests llvm-svn: 228535	2015-02-08 20:01:12 +00:00
Sanjay Patel	c871fe746b	fix test attributes; these are SSE2 tests, not Nehalem tests llvm-svn: 228534	2015-02-08 19:50:55 +00:00
Sanjay Patel	3fa03da6f8	generalize test; nothing Nehalem-specific here llvm-svn: 228532	2015-02-08 19:38:25 +00:00
Simon Pilgrim	e490385843	[X86][AVX2] AVX2 broadcast + permute memory folding tests. llvm-svn: 228528	2015-02-08 18:33:13 +00:00
Simon Pilgrim	7440699267	[X86][AVX2] AVX2 integer stack folding tests. This adds tests for the remaining AVX2 instructions that currently support memory folding. llvm-svn: 228513	2015-02-07 23:28:16 +00:00
Simon Pilgrim	a2618679a8	[X86][AVX] Added missing stack folding support + test for vptest ymm instruction llvm-svn: 228509	2015-02-07 21:44:06 +00:00
Simon Pilgrim	cbc3b2fdc2	[X86][SSE] Added missing stack folding tests for (v)mpsadbw instruction llvm-svn: 228506	2015-02-07 21:20:11 +00:00
Simon Pilgrim	0238b96c06	[X86] Force fp stack folding tests to keep to specific domain. General boolean instructions (AND, ANDN, OR, XOR) need to use a specific domain instruction (and not just the default). llvm-svn: 228495	2015-02-07 16:14:55 +00:00
Simon Pilgrim	947ce78d49	[X86][AVX2] More AVX2 integer stack folding tests. llvm-svn: 228494	2015-02-07 16:07:27 +00:00
David Majnemer	5614ea9aae	MC: Emit COFF section flags in the "proper" order COFF section flags are not idempotent: 'rd' will make a read-write section because 'd' implies write 'dr' will make a read-only section because 'r' disables write llvm-svn: 228490	2015-02-07 08:26:40 +00:00
Simon Pilgrim	76cb85a6c7	[X86][AVX2] Begun adding AVX2 integer stack folding tests. llvm-svn: 228462	2015-02-06 23:12:15 +00:00
Reid Kleckner	526ec29370	Don't dllexport declarations Fixes PR22488 llvm-svn: 228411	2015-02-06 17:59:49 +00:00
Matthias Braun	5cfba2f573	X86: Test cleanup Use FileCheck, make it more consistent and do not rely on unoptimized or(cmp,cmp) getting combined for max to be matched. llvm-svn: 228361	2015-02-05 23:52:12 +00:00
Ahmed Bougacha	e892d13d90	[CodeGen] Add hook/combine to form vector extloads, enabled on X86. The combine that forms extloads used to be disabled on vector types, because "None of the supported targets knows how to perform load and sign extend on vectors in one instruction." That's not entirely true, since at least SSE4.1 X86 knows how to do those sextloads/zextloads (with PMOVS/ZX). But there are several aspects to getting this right. First, vector extloads are controlled by a profitability callback. For instance, on ARM, several instructions have folded extload forms, so it's not always beneficial to create an extload node (and trying to match extloads is a whole 'nother can of worms). The interesting optimization enables folding of s/zextloads to illegal (splittable) vector types, expanding them into smaller legal extloads. It's not ideal (it introduces some legalization-like behavior in the combine) but it's better than the obvious alternative: form illegal extloads, and later try to split them up. If you do that, you might generate extloads that can't be split up, but have a valid ext+load expansion. At vector-op legalization time, it's too late to generate this kind of code, so you end up forced to scalarize. It's better to just avoid creating egregiously illegal nodes. This optimization is enabled unconditionally on X86. Note that the splitting combine is happy with "custom" extloads. As is, this bypasses the actual custom lowering, and just unrolls the extload. But from what I've seen, this is still much better than the current custom lowering, which does some kind of unrolling at the end anyway (see for instance load_sext_4i8_to_4i64 on SSE2, and the added FIXME). Also note that the existing combine that forms extloads is now also enabled on legal vectors. This doesn't have a big effect on X86 (because sext+load is usually combined to sext_inreg+aextload). On ARM it fires on some rare occasions; that's for a separate commit. Differential Revision: http://reviews.llvm.org/D6904 llvm-svn: 228325	2015-02-05 18:31:02 +00:00
Andrew Trick	7fc4583eda	X86 ABI fix for return values > 24 bytes. The return value's address must be returned in %rax. i.e. the callee needs to copy the sret argument (%rdi) into the return value (%rax). This probably won't manifest as a bug when the caller is LLVM-compiled code. But it is an ABI guarantee and tools expect it. llvm-svn: 228321	2015-02-05 18:09:05 +00:00
Bruno Cardoso Lopes	ab9ae87623	[X86][MMX] Handle i32->mmx conversion using movd Implement a BITCAST dag combine to transform i32->mmx conversion patterns into a X86 specific node (MMX_MOVW2D) and guarantee that moves between i32 and x86mmx are better handled, i.e., don't use store-load to do the conversion. llvm-svn: 228293	2015-02-05 13:23:07 +00:00
Bruno Cardoso Lopes	cc6089d2e0	[X86][MMX] Add several bitcast tests Avoid regression in previously supported MMX code by adding different combinations of tests which exercise MMX bitcasts. Small improvements to these patterns should come next. llvm-svn: 228292	2015-02-05 13:22:57 +00:00
Rafael Espindola	a092f17580	Don't try to make sections in comdats SHF_MERGE. Parts of llvm were not expecting it and we wouldn't print the entity size of the section. Given what comdats are used for, having SHF_MERGE sections would be just a small improvement, so just disable it for now. Fixes pr22463. llvm-svn: 228196	2015-02-04 21:27:24 +00:00
Michael Kuperstein	cd63c5fa73	Fixes a bug in vector load legalization that confused bits and bytes. Differential Revision: http://reviews.llvm.org/D7400 llvm-svn: 228168	2015-02-04 18:54:01 +00:00
Chandler Carruth	4d31f58c88	[x86] Give movss and movsd execution domains in the x86 backend. This associates movss and movsd with the packed single and packed double execution domains (resp.). While this is largely cosmetic, as we now don't have weird ping-pong-ing between single and double precision, it is also useful because it avoids the domain fixing algorithm from seeing domain breaks that don't actually exist. It will also be much more important if we have an execution domain default other than packed single, as that would cause us to mix movss and movsd with integer vector code on a regular basis, a very bad mixture. llvm-svn: 228135	2015-02-04 10:58:53 +00:00
Chandler Carruth	78c8dcd9d3	[x86] Remove a low-value test that was just checking how we cleared a register. We have lots of tests covering this. llvm-svn: 228133	2015-02-04 10:47:34 +00:00
Chandler Carruth	bb525e336b	[x86] Mechanically update a bunch of tests' check lines using the latest version of the script. Changes include: - Using the VEX prefix - Skipping more detail when we have useful shuffle comments to match - Matching more shuffle comments that have been added to the printer (yay!) - Matching the destination registers of some AVX instructions - Stripping trailing whitespace that crept in - Fixing indentation issues Nothing interesting going on here. I'm just trying really hard to ensure these changes don't show up in the diffs with actual changes to the backend. llvm-svn: 228132	2015-02-04 10:46:53 +00:00
Chandler Carruth	22b1525ae8	[x86] Include the destination register in the check-lines for AVX instructions. No actual change here. llvm-svn: 228127	2015-02-04 09:18:27 +00:00
Chandler Carruth	18ba596609	[x86] Add some tests I missed in the prior commit to cover blends with zero for v8i16 as well. These exhibit the same domain badness, but also exhibit other weaknesses in our blend lowering. More fixes to come. llvm-svn: 228126	2015-02-04 09:15:46 +00:00
Chandler Carruth	024cf8efd7	[x86] Start to introduce bit-masking based blend lowering. This is the simplest form of bit-math based blending which only fires when we are blending with zero and is relatively profitable. I've only enabled this path on very specific lowering strategies. I'm planning to widen its applicability in subsequent patches, but so far you'll notice that even though we get fewer shufps instructions, we still do the bit math in the FP execution port. I'm looking into why this is still happening. llvm-svn: 228124	2015-02-04 09:06:05 +00:00
Chandler Carruth	872d80e7a4	[x86] Add tests for blends-with-zero on 4-element vectors. llvm-svn: 228122	2015-02-04 09:05:58 +00:00
Chandler Carruth	abd09a1f35	[x86] Refresh the checks of a number of tests using update_llc_test_checks.py. The exact format of the checks has changed over time. This includes different indenting rules, new shuffle comments that have been added, and more operand hiding behind regular expressions. No functional change to the tests are expected here, but this will make subsequent patches have a clean diff as they change shuffle lowering. llvm-svn: 228097	2015-02-04 00:58:42 +00:00
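
Commit 24433a7005 above notes that double rounding isn't the same as rounding. The following is a minimal sketch of that effect, not code from the commit: it uses Python's struct module to round an IEEE double to single precision and then to half precision, and compares that with rounding to half precision in one step. The helper names and the chosen input value are illustrative assumptions.

```python
import struct


def round_to_float(x: float) -> float:
    """Round an IEEE double to single precision (binary32) and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_to_half(x: float) -> float:
    """Round an IEEE double to half precision (binary16) and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Illustrative input: just above the midpoint between the two adjacent
# half-precision values 1.0 and 1.0 + 2**-10.
x = 1.0 + 2.0**-11 + 2.0**-30

# One rounding step: x is above the midpoint, so it rounds up.
direct = round_to_half(x)

# Two rounding steps: the first rounding drops the 2**-30 term, leaving the
# exact midpoint, and the second rounding then breaks the tie toward even.
two_step = round_to_half(round_to_float(x))

print(direct, two_step)
```

The first rounding is not value preserving here, and it creates a tie that the single-step rounding never sees, which is the situation the commit message describes.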


5584 Commits