llvm-project

Sanjay Patel a0d8a278a7 [x86] use a single shufps when it can save instructions		2016-12-15 18:03:38 +00:00

This is a tiny patch with a big pile of test changes.

This partially fixes PR27885:
https://llvm.org/bugs/show_bug.cgi?id=27885

My motivating case looks like this:

- vpshufd {{.*#+}} xmm1 = xmm1[0,1,0,2]
- vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
- vpblendw {{.*#+}} xmm0 = xmm0[0,1,2,3],xmm1[4,5,6,7]
+ vshufps {{.*#+}} xmm0 = xmm0[0,2],xmm1[0,2]

And this happens several times in the diffs. For chips with domain-crossing penalties, the instruction count and size reduction should usually overcome any potential domain-crossing penalty due to using an FP op in a sequence of int ops. For chips such as recent Intel big cores and Atom, there is no domain-crossing penalty for shufps, so using shufps is a pure win.

So the test case diffs all appear to be improvements except one test in vector-shuffle-combining.ll where we miss an opportunity to use a shift to generate zero elements and one test in combine-sra.ll where multiple uses prevent the expected shuffle combining.

Differential Revision: https://reviews.llvm.org/D27692

llvm-svn: 289837
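The equivalence claimed by the motivating case can be checked with a small lane-level model. The sketch below is illustrative only and is not part of the patch: the helper names (`pshufd`, `pblendw`, `shufps`, `old_sequence`, `new_sequence`) are hypothetical Python stand-ins that model each 128-bit register as a list of four 32-bit lanes, and they ignore the FP/integer execution domains entirely.

```python
def to_words(dwords):
    """Split four 32-bit lanes into eight 16-bit words, low word first."""
    words = []
    for d in dwords:
        words += [d & 0xFFFF, (d >> 16) & 0xFFFF]
    return words


def from_words(words):
    """Rejoin eight 16-bit words into four 32-bit lanes."""
    return [words[2 * i] | (words[2 * i + 1] << 16) for i in range(4)]


def pshufd(src, idx):
    """Model of a dword shuffle: result lane i is src[idx[i]]."""
    return [src[i] for i in idx]


def pblendw(a, b, words_from_b):
    """Model of a word blend: take the listed word positions from b, the rest from a."""
    wa, wb = to_words(a), to_words(b)
    return from_words([wb[i] if i in words_from_b else wa[i] for i in range(8)])


def shufps(a, b, idx_a, idx_b):
    """Model of shufps: low two lanes are picked from a, high two lanes from b."""
    return [a[idx_a[0]], a[idx_a[1]], b[idx_b[0]], b[idx_b[1]]]


def old_sequence(xmm0, xmm1):
    """The three-instruction sequence removed in the motivating case."""
    xmm1 = pshufd(xmm1, [0, 1, 0, 2])
    xmm0 = pshufd(xmm0, [0, 2, 2, 3])
    return pblendw(xmm0, xmm1, {4, 5, 6, 7})


def new_sequence(xmm0, xmm1):
    """The single shuffle that replaces it."""
    return shufps(xmm0, xmm1, [0, 2], [0, 2])


if __name__ == "__main__":
    a = [0xA0A0A0A0, 0xA1A1A1A1, 0xA2A2A2A2, 0xA3A3A3A3]
    b = [0xB0B0B0B0, 0xB1B1B1B1, 0xB2B2B2B2, 0xB3B3B3B3]
    print(old_sequence(a, b) == new_sequence(a, b))
```

Both paths select lanes 0 and 2 of each source, which is why one two-source shuffle can stand in for two single-source shuffles plus a blend.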
Analysis	[CostModel] Fix long standing bug with reverse shuffle mask detection	2016-12-15 12:12:45 +00:00
AsmParser	Replace APFloatBase static fltSemantics data members with getter functions	2016-12-14 11:57:17 +00:00
Bitcode	Replace APFloatBase static fltSemantics data members with getter functions	2016-12-14 11:57:17 +00:00
CodeGen	Extract LaneBitmask into a separate type	2016-12-15 14:36:06 +00:00
DebugInfo	Add the ability to get attribute values as Optional<T>	2016-12-14 22:38:08 +00:00
Demangle	Demangle: remove references to allocator for default allocator	2016-11-20 00:20:27 +00:00
ExecutionEngine	Replace APFloatBase static fltSemantics data members with getter functions	2016-12-14 11:57:17 +00:00
Fuzzer	[libFuzzer] enable the failure-resistant merge by default (with trace-pc-guard only)	2016-12-15 06:21:21 +00:00
IR	[DebugInfo] Changed DIBuilder::createCompileUnit() to take DIFile instead of FileName and Directory.	2016-12-14 20:24:54 +00:00
IRReader	Timer: Track name and description.	2016-11-18 19:43:18 +00:00
LTO	[LTO] Reject modules without datalayout.	2016-12-14 21:57:04 +00:00
LibDriver	LibDriver: Reject inputs that are not COFF objects or bitcode files.	2016-12-14 22:19:22 +00:00
LineEditor	…
Linker	[ThinLTO] Import only necessary DICompileUnit fields	2016-12-12 16:09:30 +00:00
MC	Extract LaneBitmask into a separate type	2016-12-15 14:36:06 +00:00
Object	Object: Make IRObjectFile own multiple modules and enumerate symbols from all modules.	2016-12-13 20:20:17 +00:00
ObjectYAML	[ARM] Implement execute-only support in CodeGen	2016-12-15 07:59:08 +00:00
Option	Generalize ArgList::AddAllArgs more	2016-09-29 19:47:58 +00:00
Passes	Remove the AssumptionCache	2016-12-15 03:02:15 +00:00
ProfileData	Make the Error class constructor protected	2016-11-11 04:28:40 +00:00
Support	Include <cstdarg> in PrettyStackTrace.cpp, fixing the bots.	2016-12-14 19:19:53 +00:00
TableGen	[TableGen] Centralize/Unify error handling.	2016-12-05 22:58:01 +00:00
Target	[x86] use a single shufps when it can save instructions	2016-12-15 18:03:38 +00:00
Transforms	Revert "[SimplifyCFG] In sinkLastInstruction correctly set debugloc of common inst"	2016-12-15 16:59:13 +00:00
CMakeLists.txt	Try to fix a circular dependency in the modules build.	2016-09-06 20:16:19 +00:00
LLVMBuild.txt	Add an c++ itanium demangler to llvm.	2016-09-06 19:16:48 +00:00