Commit Graph

7 Commits

Author SHA1 Message Date
Matthew Simpson 476c0afc01 [ARM, AArch64] Match additional patterns to ldN instructions
When matching an interleaved load to an ldN pattern, the interleaved access
pass checks that all users of the load are shuffles. If the load is used by an
instruction other than a shuffle, the pass gives up and an ldN is not
generated. This patch considers users of the load that are extractelement
instructions. It attempts to modify the extracts to use one of the available
shuffles rather than the load. After the transformation, the load is only used
by shuffles and will then be matched with an ldN pattern.

Differential Revision: http://reviews.llvm.org/D20250

llvm-svn: 270142
2016-05-19 21:39:00 +00:00
Matthew Simpson 330a125542 [ARM, AArch64] Properly initialize InterleavedAccessPass
InterleavedAccessPass is an IR-level pass, so this change will enable testing
it with opt. This is part of D20250.

llvm-svn: 270101
2016-05-19 20:08:32 +00:00
Chad Rosier a306eeb252 Cleanup comments. NFC.
llvm-svn: 268233
2016-05-02 14:32:17 +00:00
Silviu Baranga 6d3f05c04b [ARM][AArch64] Turn on by default interleaved access lowering
Summary:
Interleaved access lowering removes a memory operation and a
sequence of vector shuffles and replaces it with a series of
memory operations. This should be always beneficial.

This pass in only enabled on ARM/AArch64.

Reviewers: rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D12145

llvm-svn: 246540
2015-09-01 11:12:35 +00:00
Nico Rieck 78199518c4 Rename inst_range() to instructions() for consistency. NFC
llvm-svn: 244248
2015-08-06 19:10:45 +00:00
Hao Liu b41c0b44af [InterleavedAccess] Fix failures "undefined type 'llvm::raw_ostream'" on windows.
llvm-svn: 240760
2015-06-26 04:38:21 +00:00
Hao Liu 1c1e0c9e71 [InterleavedAccess] Add a pass InterleavedAccess to identify interleaved memory accesses and transform into target specific intrinsics.
E.g. An interleaved load (Factor = 2):
        %wide.vec = load <8 x i32>, <8 x i32>* %ptr
        %v0 = shuffle <8 x i32> %wide.vec, <8 x i32> undef, <0, 2, 4, 6>
        %v1 = shuffle <8 x i32> %wide.vec, <8 x i32> undef, <1, 3, 5, 7>
It can be transformed into a ld2 intrinsic in AArch64 backend or a vld2 intrinsic in ARM backend.

E.g. An interleaved store (Factor = 3):
        %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11>
        store <12 x i32> %i.vec, <12 x i32>* %ptr
It can be transformed into a st3 intrinsic in AArch64 backend or a vst3 intrinsic in ARM backend.

Differential Revision: http://reviews.llvm.org/D10533

llvm-svn: 240751
2015-06-26 02:10:27 +00:00