llvm-project

Commit Graph

Author	SHA1	Message	Date
Matthias Braun	165d467125	MachineCopyPropagation: Remove the copies instead of using KILL instructions. For some history here see the commit messages of r199797 and r169060. The original intent was to fix cases like: %EAX<def> = COPY %ECX<kill>, %RAX<imp-def> %RCX<def> = COPY %RAX<kill> where simply removing the copies would have RCX undefined as in terms of machine operands only the ECX part of it is defined. The machine verifier would complain about this so 169060 changed such COPY instructions into KILL instructions so some super-register imp-defs would be preserved. In r199797 it was finally decided to always do this regardless of super-register defs. But this is wrong, consider: R1 = COPY R0 ... R0 = COPY R1 getting changed to: R1 = KILL R0 ... R0 = KILL R1 It now looks like R0 dies at the first KILL and won't be alive until the second KILL, while in reality R0 is alive and must not change in this part of the program. As this only happens after register allocation there is not much code still performing liveness queries so the issue was not noticed. In fact I didn't manage to create a testcase for this, without unrelated changes I am working on at the moment. The fix is simple: As of r223896 the MachineVerifier allows reads from partially defined registers, so the whole transforming COPY->KILL thing is not necessary anymore. This patch also changes a similar (but more benign case as the def and src are the same register) case in the VirtRegRewriter. Differential Revision: http://reviews.llvm.org/D10117 llvm-svn: 238588	2015-05-29 18:19:25 +00:00
Simon Pilgrim	aedd3c5160	line endings fix llvm-svn: 235800	2015-04-25 12:12:43 +00:00
David Blaikie	a79ac14fa6	[opaque pointer type] Add textual IR support for explicit type parameter to load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794	2015-02-27 21:17:42 +00:00
Simon Pilgrim	d8820ae70c	Reapplied D7816 & rL230177 & rL230278 - with an additional fix to ensure that the smallest build vector input scalar type is always used. Additional (crash) test cases already committed. llvm-svn: 230388	2015-02-24 22:08:56 +00:00
Eric Christopher	af48495130	Revert: Author: Simon Pilgrim <llvm-dev@redking.me.uk> Date: Mon Feb 23 23:04:28 2015 +0000 Fix based on post-commit comment on D7816 & rL230177 - BUILD_VECTOR operand truncation was using the BV's output scalar type instead of the input type. and Author: Simon Pilgrim <llvm-dev@redking.me.uk> Date: Sun Feb 22 18:17:28 2015 +0000 [DagCombiner] Generalized BuildVector Vector Concatenation The CONCAT_VECTORS combiner pass can transform the concat of two BUILD_VECTOR nodes into a single BUILD_VECTOR node. This patch generalises this to support any number of BUILD_VECTOR nodes, and also permits UNDEF nodes to be included as well. This was noticed as AVX vec128 -> vec256 canonicalization sometimes creates a CONCAT_VECTOR with a real vec128 lower and a vec128 UNDEF upper. Differential Revision: http://reviews.llvm.org/D7816 as the root cause of PR22678 which is causing an assertion inside the DAG combiner. I'll follow up to the main thread as well. llvm-svn: 230358	2015-02-24 19:11:00 +00:00
Simon Pilgrim	4e30d9b6d8	[DagCombiner] Generalized BuildVector Vector Concatenation The CONCAT_VECTORS combiner pass can transform the concat of two BUILD_VECTOR nodes into a single BUILD_VECTOR node. This patch generalises this to support any number of BUILD_VECTOR nodes, and also permits UNDEF nodes to be included as well. This was noticed as AVX vec128 -> vec256 canonicalization sometimes creates a CONCAT_VECTOR with a real vec128 lower and a vec128 UNDEF upper. Differential Revision: http://reviews.llvm.org/D7816 llvm-svn: 230177	2015-02-22 18:17:28 +00:00
Simon Pilgrim	fccc3ab741	[X86][SSE] Added shuffle based integer zero extension tests. llvm-svn: 230145	2015-02-21 21:25:16 +00:00
Michael Kuperstein	ff5acaf50c	[X86] Combine vector anyext + and into a vector zext Vector zext tends to get legalized into a vector anyext, represented as a vector shuffle with an undef vector + a bitcast, that gets ANDed with a mask that zeroes the undef elements. Combine this into an explicit shuffle with a zero vector instead. This allows shuffle lowering to match it as a zext, instead of matching it as an anyext and emitting an explicit AND. This combine only covers a subset of the cases, but it's a start. Differential Revision: http://reviews.llvm.org/D7666 llvm-svn: 229480	2015-02-17 08:22:51 +00:00
Chandler Carruth	1c60d18aee	[x86] Update some tests with the latest version of my script and llc. This mostly adds some shuffle decode comments and cleans up indentation. llvm-svn: 229296	2015-02-15 09:26:15 +00:00
Ahmed Bougacha	e892d13d90	[CodeGen] Add hook/combine to form vector extloads, enabled on X86. The combine that forms extloads used to be disabled on vector types, because "None of the supported targets knows how to perform load and sign extend on vectors in one instruction." That's not entirely true, since at least SSE4.1 X86 knows how to do those sextloads/zextloads (with PMOVS/ZX). But there are several aspects to getting this right. First, vector extloads are controlled by a profitability callback. For instance, on ARM, several instructions have folded extload forms, so it's not always beneficial to create an extload node (and trying to match extloads is a whole 'nother can of worms). The interesting optimization enables folding of s/zextloads to illegal (splittable) vector types, expanding them into smaller legal extloads. It's not ideal (it introduces some legalization-like behavior in the combine) but it's better than the obvious alternative: form illegal extloads, and later try to split them up. If you do that, you might generate extloads that can't be split up, but have a valid ext+load expansion. At vector-op legalization time, it's too late to generate this kind of code, so you end up forced to scalarize. It's better to just avoid creating egregiously illegal nodes. This optimization is enabled unconditionally on X86. Note that the splitting combine is happy with "custom" extloads. As is, this bypasses the actual custom lowering, and just unrolls the extload. But from what I've seen, this is still much better than the current custom lowering, which does some kind of unrolling at the end anyway (see for instance load_sext_4i8_to_4i64 on SSE2, and the added FIXME). Also note that the existing combine that forms extloads is now also enabled on legal vectors. This doesn't have a big effect on X86 (because sext+load is usually combined to sext_inreg+aextload). On ARM it fires on some rare occasions; that's for a separate commit. Differential Revision: http://reviews.llvm.org/D6904 llvm-svn: 228325	2015-02-05 18:31:02 +00:00
Ahmed Bougacha	2d80ea1939	[X86] Cleanup tabs in test vector-zext.ll. NFC. Some tests have tabs, some don't. In vector-[sz]ext.ll, space wins (well duh!). llvm-svn: 227615	2015-01-30 21:41:28 +00:00
Craig Topper	0271d10d35	[x86] Change u8imm operands to always print as unsigned. This makes shuffle masks and the like make way more sense. llvm-svn: 226902	2015-01-23 08:00:59 +00:00
Ahmed Bougacha	8b54286d1c	[X86] Refactor PMOV[SZ]Xrm to add missing AVX2 patterns. Most patterns will go away once the extload legalization changes land. Differential Revision: http://reviews.llvm.org/D6125 llvm-svn: 223567	2014-12-06 01:31:07 +00:00
Chandler Carruth	0c922fcec5	[x86] Start improving the matching of unpck instructions based on test cases from Halide folks. This initial step was extracted from a prototype change by Clay Wood to try and address regressions found with Halide and the new vector shuffle lowering. llvm-svn: 221779	2014-11-12 10:05:18 +00:00
Chandler Carruth	ce6947d4cf	[x86] Clean up a bunch of vector shuffle tests with my script. Notably, removes windows line endings and other noise. This is in prelude to making substantive changes to these tests. llvm-svn: 221776	2014-11-12 09:17:15 +00:00
Chandler Carruth	acecdc0211	[x86] Fix PR21139, one of the last remaining regressions found in the new vector shuffle lowering. This is loosely based on a patch by Marius Wachtler to the PR (thanks!). I refactored it a bit to use std::count_if and a mutable array ref but the core idea was exactly right. I also added some direct testing of this case. I believe PR21137 is now the only remaining regression. llvm-svn: 219081	2014-10-05 12:07:34 +00:00
Chandler Carruth	99627bfbff	[x86] Enable the new vector shuffle lowering by default. Update the entire regression test suite for the new shuffles. Remove most of the old testing which was devoted to the old shuffle lowering path and is no longer relevant really. Also remove a few other random tests that only really exercised shuffles and only incidentally or without any interesting aspects to them. Benchmarking that I have done shows a few small regressions with this on LNT, zero measurable regressions on real, large applications, and for several benchmarks where the loop vectorizer fires in the hot path it shows 5% to 40% improvements for SSE2 and SSE3 code running on Sandy Bridge machines. Running on AMD machines shows even more dramatic improvements. When using newer ISA vector extensions the gains are much more modest, but the code is still better on the whole. There are a few regressions being tracked (PR21137, PR21138, PR21139) but by and large this is expected to be a win for x86 generated code performance. It is also more correct than the code it replaces. I have fuzz tested this extensively with ISA extensions up through AVX2 and found no crashes or miscompiles (yet...). The old lowering had a few miscompiles and crashers after a somewhat smaller amount of fuzz testing. There is one significant area where the new code path lags behind and that is in AVX-512 support. However, there was extremely little support for that already and so this isn't a significant step backwards and the new framework will probably make it easier to implement lowering that uses the full power of AVX-512's table-based shuffle+blend (IMO). Many thanks to Quentin, Andrea, Robert, and others for benchmarking assistance. Thanks to Adam and others for help with AVX-512. Thanks to Hal, Eric, and many others for answering my incessant questions about how the backend actually works. =] I will leave the old code path in the tree until the 3 PRs above are at least resolved to folks' satisfaction. Then I will rip it (and 1000s of lines of code) out. =] I don't expect this flag to stay around for very long. It may not survive next week. llvm-svn: 219046	2014-10-04 03:52:55 +00:00
Chandler Carruth	c1bb0e84bc	[x86] Switch some of the new consolidated vector tests to use a bare-metal triple and have nice BB labels, etc. No significant change here, just tidying up to have a consistent set of OS-agnostic vector functionality here. llvm-svn: 218854	2014-10-02 06:52:19 +00:00
Chandler Carruth	bbbdb9f0ee	[x86] Teach both sext and zext vector tests to cover a nice wide range of architectures: SSE2, SSSE3, SSE4.1, AVX, and AVX2. Unfortunately, this exposes the absolute horror of the code we generate for many of these patterns. Anyone wanting to familiarize themselves with the x86 backend and improve performance could do a lot of good sitting down and making these test cases not look so terrible. While the new vector shuffle code I'm working on will help some, it won't fix all of the crimes here. llvm-svn: 218807	2014-10-01 20:41:36 +00:00
Chandler Carruth	c66ea0fc12	[x86] Rename avx-{s,z}ext.ll to vector-{s,z}ext.ll. These tests are far and away the best sext and zext tests we have for vectors. I'm going to merge the other similar tests into them and expand the ISA coverage. llvm-svn: 218800	2014-10-01 20:30:30 +00:00

20 Commits