Artur Pilipenko 41c0005aa3 [DAGCombiner] Match load by bytes idiom and fold it into a single load. Attempt #2.
The previous patch (https://reviews.llvm.org/rL289538) got reverted because of a bug. Chandler also requested some changes to the algorithm.
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161212/413479.html

This is an updated patch. The key difference is that collectBitProviders (renamed to calculateByteProvider) now collects the origin of one byte, not the whole value. It simplifies the implementation and allows us to stop the traversal earlier if we know that the result won't be used.

From the original commit:

Match a pattern where a wide-type scalar value is loaded by several narrow loads and combined by shifts and ors. Fold it into a single load, or a load and a bswap, if the target supports it.

Assuming little endian target:
  i8 *a = ...
  i32 val = a[0] | (a[1] << 8) | (a[2] << 16) | (a[3] << 24)
=>
  i32 val = *((i32 *)a)

  i8 *a = ...
  i32 val = (a[0] << 24) | (a[1] << 16) | (a[2] << 8) | a[3]
=>
  i32 val = BSWAP(*((i32 *)a))
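For readers who want to try the two idioms outside of LLVM, here is a minimal C++ sketch of the source-level patterns above. The function names (loadLE32ByBytes, loadBE32ByBytes, loadNative32, byteSwap32) are hypothetical and chosen only for illustration. On a little-endian host the first function should produce the same value as the single wide load, and the second the same value as a wide load followed by a byte swap. Whether a given compiler actually folds them depends on the target and the compiler version.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Little-endian assembly of four bytes: the first pattern above.
inline std::uint32_t loadLE32ByBytes(const std::uint8_t *a) {
  assert(a != nullptr);
  return std::uint32_t(a[0]) | (std::uint32_t(a[1]) << 8) |
         (std::uint32_t(a[2]) << 16) | (std::uint32_t(a[3]) << 24);
}

// Big-endian assembly of four bytes: the second pattern above.
inline std::uint32_t loadBE32ByBytes(const std::uint8_t *a) {
  assert(a != nullptr);
  return (std::uint32_t(a[0]) << 24) | (std::uint32_t(a[1]) << 16) |
         (std::uint32_t(a[2]) << 8) | std::uint32_t(a[3]);
}

// A single wide load in host byte order. memcpy is used instead of a
// pointer cast so the sketch has no alignment or aliasing problems.
inline std::uint32_t loadNative32(const std::uint8_t *a) {
  assert(a != nullptr);
  std::uint32_t v;
  std::memcpy(&v, a, sizeof v);
  return v;
}

// Portable 32-bit byte swap, standing in for BSWAP.
inline std::uint32_t byteSwap32(std::uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}
```

On a little-endian host, loadLE32ByBytes(p) == loadNative32(p) and loadBE32ByBytes(p) == byteSwap32(loadNative32(p)); on a big-endian host the two roles swap.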

This optimization was discussed on llvm-dev some time ago in the "Load combine pass" thread. We came to the conclusion that we want to do this transformation late in the pipeline because, in the presence of atomic loads, load widening is an irreversible transformation and it might hinder other optimizations.

Eventually we'd like to support folding patterns like this where the offset has a variable and a constant part:
  i32 val = a[i] | (a[i + 1] << 8) | (a[i + 2] << 16) | (a[i + 3] << 24)

Matching the pattern above is easier at the SelectionDAG level since address reassociation has already happened and the fact that the loads are adjacent is clear. Understanding that these loads are adjacent at the IR level would have involved looking through geps/zexts/adds while looking at the addresses.

The general scheme is to match OR expressions by recursively calculating the origin of the individual bytes that constitute the resulting OR value. If all the OR bytes come from memory, verify that they are adjacent and match the little- or big-endian encoding of a wider value. If so, and the target allows a load of the wider type (and a bswap if needed), generate the load, followed by a bswap if needed.
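The scheme can be illustrated with a toy model. The sketch below is not the DAGCombiner code: Expr, ByteOrigin, byteOrigin and matchWideLoad are hypothetical names. It assumes a 32-bit result built only from zero-extended byte loads off one base pointer, left shifts, and ORs. It also skips the target legality checks that the real patch performs.

```cpp
#include <cassert>
#include <memory>

// Hypothetical toy expression tree; it only mimics the shape of the idiom.
struct Expr;
using ExprPtr = std::shared_ptr<const Expr>;
struct Expr {
  enum Kind { LoadByte, Shl, Or } kind;
  int offset;         // LoadByte: byte offset from a common base pointer
  unsigned shiftBits; // Shl: shift amount in bits
  ExprPtr lhs, rhs;   // Shl uses lhs only; Or uses both
};

ExprPtr load(int offset) {
  return std::make_shared<Expr>(Expr{Expr::LoadByte, offset, 0u, nullptr, nullptr});
}
ExprPtr shl(ExprPtr x, unsigned bits) {
  return std::make_shared<Expr>(Expr{Expr::Shl, 0, bits, x, nullptr});
}
ExprPtr orOf(ExprPtr a, ExprPtr b) {
  return std::make_shared<Expr>(Expr{Expr::Or, 0, 0u, a, b});
}

// Origin of a single byte of the result.
struct ByteOrigin {
  enum Kind { Unknown, Zero, Memory } kind;
  int offset; // meaningful only when kind == Memory
};

// Recursively find where byte `index` (0 = least significant) comes from.
ByteOrigin byteOrigin(const Expr &e, unsigned index) {
  assert(index < 4 && "toy model covers 32-bit results only");
  switch (e.kind) {
  case Expr::LoadByte:
    // A zero-extended byte load: byte 0 is memory, the rest are zero.
    if (index == 0) return {ByteOrigin::Memory, e.offset};
    return {ByteOrigin::Zero, 0};
  case Expr::Shl: {
    if (e.shiftBits % 8 != 0) return {ByteOrigin::Unknown, 0};
    unsigned byteShift = e.shiftBits / 8;
    if (index < byteShift) return {ByteOrigin::Zero, 0}; // shifted-in zeros
    return byteOrigin(*e.lhs, index - byteShift);
  }
  case Expr::Or: {
    ByteOrigin l = byteOrigin(*e.lhs, index);
    ByteOrigin r = byteOrigin(*e.rhs, index);
    // The byte is known only if at least one side contributes zero.
    if (l.kind == ByteOrigin::Zero) return r;
    if (r.kind == ByteOrigin::Zero) return l;
    return {ByteOrigin::Unknown, 0};
  }
  }
  return {ByteOrigin::Unknown, 0};
}

enum class WideLoad { None, LittleEndian, BigEndian };

// Check that all four bytes come from adjacent memory in LE or BE order.
WideLoad matchWideLoad(const Expr &root) {
  int offsets[4];
  for (unsigned i = 0; i < 4; ++i) {
    ByteOrigin o = byteOrigin(root, i);
    if (o.kind != ByteOrigin::Memory) return WideLoad::None;
    offsets[i] = o.offset;
  }
  int base = offsets[0] < offsets[3] ? offsets[0] : offsets[3];
  bool little = true, big = true;
  for (int i = 0; i < 4; ++i) {
    little = little && offsets[i] == base + i;
    big = big && offsets[i] == base + 3 - i;
  }
  if (little) return WideLoad::LittleEndian;
  if (big) return WideLoad::BigEndian;
  return WideLoad::None;
}
```

In this model, a LittleEndian match on a little-endian target corresponds to a plain wide load, and a BigEndian match to a wide load followed by a bswap. Because byteOrigin is queried one byte at a time, a caller can stop at the first byte that does not come from memory.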

Reviewed By: RKSimon, filcab, chandlerc 

Differential Revision: https://reviews.llvm.org/D27861

llvm-svn: 293036
2017-01-25 08:53:31 +00:00
clang Revert "Use filename in linemarker when compiling preprocessed source" 2017-01-25 07:27:05 +00:00
clang-tools-extra [clang-tidy] Don't modernize-raw-string-literal if replacement is longer. 2017-01-24 15:18:11 +00:00
compiler-rt [XRay][compiler-rt] XRay Flight Data Recorder Mode 2017-01-25 03:50:46 +00:00
debuginfo-tests New round of fixes for "Always compile debuginfo-tests for the host triple" 2014-10-18 23:47:59 +00:00
libclc math: Add logb builtin 2017-01-18 03:14:10 +00:00
libcxx Implement LWG2556: Wide contract for future::share() 2017-01-24 23:28:25 +00:00
libcxxabi cxa_demangle: fix rvalue ref check 2017-01-24 19:57:05 +00:00
libunwind DWARF: fix -Asserts builds 2017-01-25 02:27:45 +00:00
lld Add a file comment to SyntheticSections.h. 2017-01-24 21:35:25 +00:00
lldb Reverted 292880 to fix a linker error. 2017-01-25 05:39:14 +00:00
llgo [llgo] Remove support for LLVM attributes 2016-12-06 19:22:04 +00:00
llvm [DAGCombiner] Match load by bytes idiom and fold it into a single load. Attempt #2. 2017-01-25 08:53:31 +00:00
openmp Use C++11 static_assert() for build asserts. 2017-01-18 07:49:30 +00:00
parallel-libs [Axccel] Remove -Wno-missing-braces in build 2016-12-19 21:34:07 +00:00
polly BlockGenerator: Do not redundantly reload from PHI-allocas in non-affine stmts 2017-01-19 14:12:45 +00:00