Go to file
Sanne Wouda 2939fc13c8 [AArch64] Add IR intrinsics for sq(r)dmulh_lane(q)
Summary:
Currently, sqdmulh_lane and friends from the ACLE (implemented in arm_neon.h),
are represented in LLVM IR as a (by vector) sqdmulh and a vector of (repeated)
indices, like so:

   %shuffle = shufflevector <4 x i16> %v, <4 x i16> undef, <4 x i32> <i32 3, i32 3, i32 3, i32 3>
   %vqdmulh2.i = tail call <4 x i16> @llvm.aarch64.neon.sqdmulh.v4i16(<4 x i16> %a, <4 x i16> %shuffle)

When %v's values are known, the shufflevector is optimized away and we are no
longer able to select the lane variant of sqdmulh in the backend.

This defeats a (hand-coded) optimization that packs several constants into a
single vector and uses the lane intrinsics to reduce register pressure and
trade-off materialising several constants for a single vector load from the
constant pool, like so:

   int16x8_t v = {2,3,4,5,6,7,8,9};
   a = vqdmulh_laneq_s16(a, v, 0);
   b = vqdmulh_laneq_s16(b, v, 1);
   c = vqdmulh_laneq_s16(c, v, 2);
   d = vqdmulh_laneq_s16(d, v, 3);
   [...]

In one microbenchmark from libjpeg-turbo this accounts for a 2.5% to 4%
performance difference.

We could teach the compiler to recover the lane variants, but this would likely
require its own pass.  (Alternatively, "volatile" could be used on the constants
vector, but this is a bit ugly.)

This patch instead implements the following LLVM IR intrinsics for AArch64 to
maintain the original structure through IR optmization and into instruction
selection:
- sqdmulh_lane
- sqdmulh_laneq
- sqrdmulh_lane
- sqrdmulh_laneq.

These 'lane' variants need an additional register class.  The second argument
must be in the lower half of the 64-bit NEON register file, but only when
operating on i16 elements.

Note that the existing patterns for shufflevector and sqdmulh into sqdmulh_lane
(etc.) remain, so code that does not rely on NEON intrinsics to generate these
instructions is not affected.

This patch also changes clang to emit these IR intrinsics for the corresponding
NEON intrinsics (AArch64 only).

Reviewers: SjoerdMeijer, dmgreen, t.p.northover, rovka, rengolin, efriedma

Reviewed By: efriedma

Subscribers: kristof.beyls, hiraditya, jdoerfert, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D71469
2020-01-29 13:25:23 +00:00
clang [AArch64] Add IR intrinsics for sq(r)dmulh_lane(q) 2020-01-29 13:25:23 +00:00
clang-tools-extra [clangd][vscode] Update lsp dependencies to pickup the progress support in LSP 3.15 2020-01-29 12:58:53 +01:00
compiler-rt [asan] Fix test compilation on Android API <= 17 2020-01-28 14:36:19 -08:00
debuginfo-tests Add pretty printers for llvm::PointerIntPair and llvm::PointerUnion. 2020-01-27 17:23:59 +01:00
libc Fix missing dependency in LibcUnitTest 2020-01-27 10:14:39 +01:00
libclc libclc: Drop the old python based build system 2019-11-08 09:59:40 -05:00
libcxx [libcxx] Link against android_support when needed 2020-01-28 14:36:24 -08:00
libcxxabi [libcxx] Link against android_support when needed 2020-01-28 14:36:24 -08:00
libunwind [libunwind] Treat assembly files as C on mingw 2020-01-27 09:04:58 +02:00
lld [LLD][ELF][ARM] Do not substitute BL/BLX for non STT_FUNC symbols. 2020-01-29 11:42:25 +00:00
lldb [DebugInfo] Make most debug line prologue errors non-fatal to parsing 2020-01-29 10:23:41 +00:00
llgo IR: Support parsing numeric block ids, and emit them in textual output. 2019-03-22 18:27:13 +00:00
llvm [AArch64] Add IR intrinsics for sq(r)dmulh_lane(q) 2020-01-29 13:25:23 +00:00
mlir [MLIR] Add OpenMP dialect with barrier operation 2020-01-29 11:34:58 +00:00
openmp Revert "[nfc][libomptarget] Remove SHARED annotation from local variables" 2020-01-27 20:05:17 +00:00
parallel-libs Fix typos throughout the license files that somehow I and my reviewers 2019-01-21 09:52:34 +00:00
polly Fix polly build after StringRef change. 2020-01-28 19:44:20 -08:00
pstl Bump the trunk major version to 11 2020-01-15 13:38:01 +01:00
.arcconfig Include phabricator.uri in .arcconfig 2020-01-23 11:50:18 -08:00
.clang-format Add .clang-tidy and .clang-format files to the toplevel of the 2019-01-29 16:43:16 +00:00
.clang-tidy Disable tidy checks with too many hits 2019-02-01 11:20:13 +00:00
.git-blame-ignore-revs Add LLDB reformatting to .git-blame-ignore-revs 2019-09-04 09:31:55 +00:00
.gitignore Add a newline at the end of the file 2019-09-04 06:33:46 +00:00
CONTRIBUTING.md Add contributing info to CONTRIBUTING.md and README.md 2019-12-02 15:47:15 +00:00
README.md Add contributing info to CONTRIBUTING.md and README.md 2019-12-02 15:47:15 +00:00

README.md

The LLVM Compiler Infrastructure

This directory and its subdirectories contain source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and runtime environments.

The README briefly describes how to get started with building LLVM. For more information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.

Getting Started with the LLVM System

Taken from https://llvm.org/docs/GettingStarted.html.

Overview

Welcome to the LLVM project!

The LLVM project has multiple components. The core of the project is itself called "LLVM". This contains all of the tools, libraries, and header files needed to process intermediate representations and converts it into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer. It also contains basic regression tests.

C-like languages use the Clang front end. This component compiles C, C++, Objective C, and Objective C++ code into LLVM bitcode -- and from there into object files, using LLVM.

Other components include: the libc++ C++ standard library, the LLD linker, and more.

Getting the Source Code and Building LLVM

The LLVM Getting Started documentation may be out of date. The Clang Getting Started page might have more accurate information.

This is an example workflow and configuration to get and build the LLVM source:

  1. Checkout LLVM (including related subprojects like Clang):

    • git clone https://github.com/llvm/llvm-project.git

    • Or, on windows, git clone --config core.autocrlf=false https://github.com/llvm/llvm-project.git

  2. Configure and build LLVM and Clang:

    • cd llvm-project

    • mkdir build

    • cd build

    • cmake -G <generator> [options] ../llvm

      Some common generators are:

      • Ninja --- for generating Ninja build files. Most llvm developers use Ninja.
      • Unix Makefiles --- for generating make-compatible parallel makefiles.
      • Visual Studio --- for generating Visual Studio projects and solutions.
      • Xcode --- for generating Xcode projects.

      Some Common options:

      • -DLLVM_ENABLE_PROJECTS='...' --- semicolon-separated list of the LLVM subprojects you'd like to additionally build. Can include any of: clang, clang-tools-extra, libcxx, libcxxabi, libunwind, lldb, compiler-rt, lld, polly, or debuginfo-tests.

        For example, to build LLVM, Clang, libcxx, and libcxxabi, use -DLLVM_ENABLE_PROJECTS="clang;libcxx;libcxxabi".

      • -DCMAKE_INSTALL_PREFIX=directory --- Specify for directory the full pathname of where you want the LLVM tools and libraries to be installed (default /usr/local).

      • -DCMAKE_BUILD_TYPE=type --- Valid options for type are Debug, Release, RelWithDebInfo, and MinSizeRel. Default is Debug.

      • -DLLVM_ENABLE_ASSERTIONS=On --- Compile with assertion checks enabled (default is Yes for Debug builds, No for all other build types).

    • Run your build tool of choice!

      • The default target (i.e. ninja or make) will build all of LLVM.

      • The check-all target (i.e. ninja check-all) will run the regression tests to ensure everything is in working order.

      • CMake will generate build targets for each tool and library, and most LLVM sub-projects generate their own check-<project> target.

      • Running a serial build will be slow. To improve speed, try running a parallel build. That's done by default in Ninja; for make, use make -j NNN (NNN is the number of parallel jobs, use e.g. number of CPUs you have.)

    • For more information see CMake

Consult the Getting Started with LLVM page for detailed information on configuring and compiling LLVM. You can visit Directory Layout to learn about the layout of the source code tree.