Commit Graph

11 Commits

Author SHA1 Message Date
Michael Liao 5fbcd81793 Revise alignment checking/calculation on 256-bit unaligned memory access
- It's still considered aligned when the specified alignment is larger
  than the natural alignment;
- The new alignment for the high 128-bit vector should be min(16,
  alignment) as the pointer is advanced by 16, a power-of-2 offset.

llvm-svn: 177947
2013-03-25 23:50:10 +00:00
Michael Liao bb05a1d7b5 Enhance folding of (extract_subvec (insert_subvec V1, V2, IIdx), EIdx)
- Handle the case where the result of 'insert_subvect' is bitcasted
  before 'extract_subvec'. This removes the redundant insertf128/extractf128
  pair on unaligned 256-bit vector load/store on vectors of non 64-bit integer.

llvm-svn: 177945
2013-03-25 23:47:35 +00:00
Nadav Rotem adfa5eaf8c Unaligned loads should use the VMOVUPS opcode.
llvm-svn: 177130
2013-03-14 23:49:44 +00:00
Nadav Rotem 7b3120b9ae On Sandybridge split unaligned 256bit stores into two xmm-sized stores.
llvm-svn: 172894
2013-01-19 08:38:41 +00:00
Nadav Rotem baae7e4577 Fix a bug in the lowering of BUILD_VECTOR for AVX. SCALAR_TO_VECTOR does not zero untouched elements. Use INSERT_VECTOR_ELT instead.
llvm-svn: 147948
2012-01-11 14:07:51 +00:00
Jakob Stoklund Olesen 30c811246f Remove X86-dependent stuff from SSEDomainFix.
This also enables domain swizzling for AVX code which required a few
trivial test changes.

The pass will be moved to lib/CodeGen shortly.

llvm-svn: 140659
2011-09-27 23:50:46 +00:00
Bruno Cardoso Lopes ea8d803bb0 Fix PR10844: Add patterns to cover non foldable versions of X86vzmovl.
Triggered using llc -O0. Also fix some SET0PS patterns to their AVX
forms and test it on the testcase.

llvm-svn: 139304
2011-09-08 18:05:02 +00:00
Bruno Cardoso Lopes 9eb3762e08 Fix the test added by Nadav in r137308. Make it more strict:
1) check for the "v" version of movaps
2) add a couple of CHECK-NOT to guarantee the behavior
3) move to a more appropriate test file

llvm-svn: 137361
2011-08-11 21:50:35 +00:00
Bruno Cardoso Lopes fc481959d2 Add v16i16 and v32i8 store patterns
llvm-svn: 137166
2011-08-09 22:39:53 +00:00
Bruno Cardoso Lopes 2fc107365b Add two patterns to match special vmovss and vmovsd cases. Also fix
the patterns already there to be more strict regarding the predicate.
This fixes PR10558

llvm-svn: 137100
2011-08-09 01:43:09 +00:00
Bruno Cardoso Lopes 6778597deb Add 256-bit load/store recognition and matching in several places.
llvm-svn: 135171
2011-07-14 18:50:58 +00:00