llvm-project/llvm
Sanne Wouda 2939fc13c8 [AArch64] Add IR intrinsics for sq(r)dmulh_lane(q)
Summary:
Currently, sqdmulh_lane and friends from the ACLE (implemented in arm_neon.h),
are represented in LLVM IR as a (by vector) sqdmulh and a vector of (repeated)
indices, like so:

   %shuffle = shufflevector <4 x i16> %v, <4 x i16> undef, <4 x i32> <i32 3, i32 3, i32 3, i32 3>
   %vqdmulh2.i = tail call <4 x i16> @llvm.aarch64.neon.sqdmulh.v4i16(<4 x i16> %a, <4 x i16> %shuffle)

When %v's values are known, the shufflevector is optimized away and we are no
longer able to select the lane variant of sqdmulh in the backend.

This defeats a (hand-coded) optimization that packs several constants into a
single vector and uses the lane intrinsics to reduce register pressure, trading
the materialisation of several constants for a single vector load from the
constant pool, like so:

   int16x8_t v = {2,3,4,5,6,7,8,9};
   a = vqdmulh_laneq_s16(a, v, 0);
   b = vqdmulh_laneq_s16(b, v, 1);
   c = vqdmulh_laneq_s16(c, v, 2);
   d = vqdmulh_laneq_s16(d, v, 3);
   [...]

In one microbenchmark from libjpeg-turbo this accounts for a 2.5% to 4%
performance difference.
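
For readers unfamiliar with the instruction, the following is a minimal scalar
sketch in portable C of what a call such as vqdmulh_laneq_s16(a, v, lane)
computes: each 16-bit element of a is multiplied by the single selected element
of v, the product is doubled, and the high 16 bits are kept with saturation.
The function names sqdmulh_s16_model and sqdmulh_laneq_s16_model are
hypothetical and chosen for illustration only; this is not code from the patch
or from arm_neon.h.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one sqdmulh lane: saturate((2 * a * b) >> 16).
 * Hypothetical helper, for illustration only. */
static int16_t sqdmulh_s16_model(int16_t a, int16_t b) {
    int32_t product = (int32_t)a * (int32_t)b;   /* always fits in 32 bits */

    /* (2 * product) >> 16 equals floor(product / 32768). Use an explicit
     * floor division so the result does not depend on how the compiler
     * shifts negative values. */
    int32_t q = product / 32768;
    if (product % 32768 < 0) {
        q -= 1;
    }

    /* Only INT16_MIN * INT16_MIN overflows the 16-bit result; saturate. */
    if (q > INT16_MAX) {
        return INT16_MAX;
    }
    return (int16_t)q;
}

/* Model of the "laneq" form: a has 4 elements, v has 8, and every element
 * of a is multiplied by the one element v[lane]. */
static void sqdmulh_laneq_s16_model(int16_t out[4], const int16_t a[4],
                                    const int16_t v[8], int lane) {
    assert(lane >= 0 && lane < 8);
    for (int i = 0; i < 4; i++) {
        out[i] = sqdmulh_s16_model(a[i], v[lane]);
    }
}
```

Because only v[lane] is read, several calls can share one constant vector v and
differ only in the lane index, which is the pattern the hand-coded optimization
above relies on.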

We could teach the compiler to recover the lane variants, but this would likely
require its own pass.  (Alternatively, "volatile" could be used on the constants
vector, but this is a bit ugly.)

This patch instead implements the following LLVM IR intrinsics for AArch64 to
maintain the original structure through IR optimization and into instruction
selection:
- sqdmulh_lane
- sqdmulh_laneq
- sqrdmulh_lane
- sqrdmulh_laneq.

These 'lane' variants need an additional register class.  The second argument
must be in the lower half of the 64-bit NEON register file, but only when
operating on i16 elements.

Note that the existing patterns for shufflevector and sqdmulh into sqdmulh_lane
(etc.) remain, so code that does not rely on NEON intrinsics to generate these
instructions is not affected.

This patch also changes clang to emit these IR intrinsics for the corresponding
NEON intrinsics (AArch64 only).

Reviewers: SjoerdMeijer, dmgreen, t.p.northover, rovka, rengolin, efriedma

Reviewed By: efriedma

Subscribers: kristof.beyls, hiraditya, jdoerfert, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71469
2020-01-29 13:25:23 +00:00
benchmarks
bindings Pass length of string in Go binding of CreateCompileUnit 2020-01-17 13:35:30 -08:00
cmake [CMake] Set ASM compiler for external projects 2020-01-28 11:39:21 -08:00
docs Fix sphinx build bot failure. NFCI. 2020-01-28 22:07:34 +08:00
examples Fix conversions in clang and examples 2020-01-29 02:48:15 +01:00
include [AArch64] Add IR intrinsics for sq(r)dmulh_lane(q) 2020-01-29 13:25:23 +00:00
lib [AArch64] Add IR intrinsics for sq(r)dmulh_lane(q) 2020-01-29 13:25:23 +00:00
projects
resources
runtimes [runtimes] Fix passing lists to runtimes configures 2020-01-28 14:36:14 -08:00
test [AArch64] Add IR intrinsics for sq(r)dmulh_lane(q) 2020-01-29 13:25:23 +00:00
tools One more bugpoitn fix for GCC5 2020-01-29 03:42:02 +01:00
unittests [DebugInfo] Make most debug line prologue errors non-fatal to parsing 2020-01-29 10:23:41 +00:00
utils Address implicit conversions detected by g++ 5 only. 2020-01-29 01:01:09 +01:00
.arcconfig
.clang-format
.clang-tidy
.gitattributes
.gitignore [llvm] Fix file ignoring inside directories 2020-01-27 17:00:33 -08:00
CMakeLists.txt Bump the trunk major version to 11 2020-01-15 13:38:01 +01:00
CODE_OWNERS.TXT [VE] Target stub for NEC SX-Aurora 2020-01-09 11:17:35 +01:00
CREDITS.TXT
LICENSE.TXT
LLVMBuild.txt
README.txt
RELEASE_TESTERS.TXT
configure
llvm.spec.in

README.txt

The LLVM Compiler Infrastructure
================================

This directory and its subdirectories contain source code for LLVM,
a toolkit for the construction of highly optimized compilers,
optimizers, and runtime environments.

LLVM is open source software. You may freely distribute it under the terms of
the license agreement found in LICENSE.TXT.

Please see the documentation provided in docs/ for further
assistance with LLVM, and in particular docs/GettingStarted.rst for getting
started with LLVM and docs/README.txt for an overview of LLVM's
documentation setup.

If you are writing a package for LLVM, see docs/Packaging.rst for our
suggestions.