llvm-project/llvm/lib
Sanjay Patel a0d8a278a7 [x86] use a single shufps when it can save instructions
This is a tiny patch with a big pile of test changes.
This partially fixes PR27885:
https://llvm.org/bugs/show_bug.cgi?id=27885

My motivating case looks like this:

  - vpshufd {{.*#+}} xmm1 = xmm1[0,1,0,2]
  - vpshufd {{.*#+}} xmm0 = xmm0[0,2,2,3]
  - vpblendw {{.*#+}} xmm0 = xmm0[0,1,2,3],xmm1[4,5,6,7]

  + vshufps {{.*#+}} xmm0 = xmm0[0,2],xmm1[0,2]

This pattern occurs several times in the test diffs. On chips with domain-crossing
penalties, the reduction in instruction count and code size should usually outweigh
the cost of using an FP op in a sequence of integer ops. On chips such as recent
Intel big cores and Atom, shufps incurs no domain-crossing penalty, so using it is
a pure win.
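
To see why the single shufps is equivalent to the three-instruction sequence in the
motivating case, here is a minimal scalar sketch of the lane movement. It is not code
from the patch: the type V4 and the helpers pshufd, blendLowHigh, shufps, oldSequence,
and newSequence are hypothetical names made up for illustration. They model only
32-bit lane selection, not the real instruction encodings or the X86 lowering code.

```cpp
#include <array>
#include <cstdint>

// Four 32-bit lanes standing in for an XMM register (illustrative only).
using V4 = std::array<std::uint32_t, 4>;

// Models pshufd: every result lane picks one lane of the single source.
V4 pshufd(const V4 &src, const std::array<int, 4> &idx) {
  return V4{src[idx[0]], src[idx[1]], src[idx[2]], src[idx[3]]};
}

// Models the pblendw from the commit message: words 0-3 come from `lo`
// and words 4-7 from `hi`, i.e. the low two dwords of `lo` and the
// high two dwords of `hi`.
V4 blendLowHigh(const V4 &lo, const V4 &hi) {
  return V4{lo[0], lo[1], hi[2], hi[3]};
}

// Models shufps: the low two result lanes are selected from `a`,
// the high two result lanes are selected from `b`.
V4 shufps(const V4 &a, const V4 &b, int a0, int a1, int b0, int b1) {
  return V4{a[a0], a[a1], b[b0], b[b1]};
}

// The "before" sequence: two pshufd ops feeding a pblendw.
V4 oldSequence(const V4 &xmm0, const V4 &xmm1) {
  V4 t1 = pshufd(xmm1, {0, 1, 0, 2}); // xmm1 = xmm1[0,1,0,2]
  V4 t0 = pshufd(xmm0, {0, 2, 2, 3}); // xmm0 = xmm0[0,2,2,3]
  return blendLowHigh(t0, t1);        // xmm0[0,1,2,3],xmm1[4,5,6,7] (words)
}

// The "after" sequence: one shufps.
V4 newSequence(const V4 &xmm0, const V4 &xmm1) {
  return shufps(xmm0, xmm1, 0, 2, 0, 2); // xmm0 = xmm0[0,2],xmm1[0,2]
}
```

Both functions select lanes 0 and 2 of xmm0 followed by lanes 0 and 2 of xmm1, which
is the "low half from one source, high half from the other" shape that shufps encodes
directly.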

The test case diffs all appear to be improvements, with two exceptions: one test in
vector-shuffle-combining.ll, where we miss an opportunity to use a shift to generate
zero elements, and one test in combine-sra.ll, where multiple uses prevent the
expected shuffle combining.

Differential Revision: https://reviews.llvm.org/D27692

llvm-svn: 289837
2016-12-15 18:03:38 +00:00
Analysis [CostModel] Fix long standing bug with reverse shuffle mask detection 2016-12-15 12:12:45 +00:00
AsmParser Replace APFloatBase static fltSemantics data members with getter functions 2016-12-14 11:57:17 +00:00
Bitcode Replace APFloatBase static fltSemantics data members with getter functions 2016-12-14 11:57:17 +00:00
CodeGen Extract LaneBitmask into a separate type 2016-12-15 14:36:06 +00:00
DebugInfo Add the ability to get attribute values as Optional<T> 2016-12-14 22:38:08 +00:00
Demangle Demangle: remove references to allocator for default allocator 2016-11-20 00:20:27 +00:00
ExecutionEngine Replace APFloatBase static fltSemantics data members with getter functions 2016-12-14 11:57:17 +00:00
Fuzzer [libFuzzer] enable the failure-resistant merge by default (with trace-pc-guard only) 2016-12-15 06:21:21 +00:00
IR [DebugInfo] Changed DIBuilder::createCompileUnit() to take DIFile instead of FileName and Directory. 2016-12-14 20:24:54 +00:00
IRReader Timer: Track name and description. 2016-11-18 19:43:18 +00:00
LTO [LTO] Reject modules without datalayout. 2016-12-14 21:57:04 +00:00
LibDriver LibDriver: Reject inputs that are not COFF objects or bitcode files. 2016-12-14 22:19:22 +00:00
LineEditor
Linker [ThinLTO] Import only necessary DICompileUnit fields 2016-12-12 16:09:30 +00:00
MC Extract LaneBitmask into a separate type 2016-12-15 14:36:06 +00:00
Object Object: Make IRObjectFile own multiple modules and enumerate symbols from all modules. 2016-12-13 20:20:17 +00:00
ObjectYAML [ARM] Implement execute-only support in CodeGen 2016-12-15 07:59:08 +00:00
Option Generalize ArgList::AddAllArgs more 2016-09-29 19:47:58 +00:00
Passes Remove the AssumptionCache 2016-12-15 03:02:15 +00:00
ProfileData Make the Error class constructor protected 2016-11-11 04:28:40 +00:00
Support Include <cstdarg> in PrettyStackTrace.cpp, fixing the bots. 2016-12-14 19:19:53 +00:00
TableGen [TableGen] Centralize/Unify error handling. 2016-12-05 22:58:01 +00:00
Target [x86] use a single shufps when it can save instructions 2016-12-15 18:03:38 +00:00
Transforms Revert "[SimplifyCFG] In sinkLastInstruction correctly set debugloc of common inst" 2016-12-15 16:59:13 +00:00
CMakeLists.txt Try to fix a circular dependency in the modules build. 2016-09-06 20:16:19 +00:00
LLVMBuild.txt Add an c++ itanium demangler to llvm. 2016-09-06 19:16:48 +00:00