Gather and Scatter are new introduced intrinsics, comming after recently implemented masked load and store.
This is the first patch for Gather and Scatter intrinsics. It includes only the syntax, parsing and verification.
Gather and Scatter intrinsics allow to perform multiple memory accesses (read/write) in one vector instruction.
The intrinsics are not target specific and will have the following syntax:
Gather:
declare <16 x i32> @llvm.masked.gather.v16i32(<16 x i32*> <vector of ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x i32> <passthru>)
declare <8 x float> @llvm.masked.gather.v8f32(<8 x float*><vector of ptrs>, i32 <alignment>, <8 x i1> <mask>, <8 x float><passthru>)
Scatter:
declare void @llvm.masked.scatter.v8i32(<8 x i32><vector value to be stored> , <8 x i32*><vector of ptrs> , i32 <alignment>, <8 x i1> <mask>)
declare void @llvm.masked.scatter.v16i32(<16 x i32> <vector value to be stored> , <16 x i32*> <vector of ptrs>, i32 <alignment>, <16 x i1><mask> )
Vector of ptrs - a set of source/destination addresses, to load/store the value.
Mask - switches on/off vector lanes to prevent memory access for switched-off lanes
vector of ptrs, value and mask should have the same vector width.
These are code examples where gather / scatter should be used and will allow function vectorization
;void foo1(int * restrict A, int * restrict B, int * restrict C) {
; for (int i=0; i<SIZE; i++) {
; A[i] = B[C[i]];
; }
;}
;void foo3(int * restrict A, int * restrict B) {
; for (int i=0; i<SIZE; i++) {
; A[B[i]] = i+5;
; }
;}
Tests will come in the following patches, with CodeGen and Vectorizer.
http://reviews.llvm.org/D7433
llvm-svn: 228521
llvm-mode was previously confused when variable names contained keywords.
This changes ensures that keywords are only highlighted when they're standalone.
Patch by Wilfred Hughes!
llvm-svn: 228396
This is done in a bit of a strange way to use a multiline RE instead of
looping over the lines. Suggestions welcome here for a more pythonic way
of doing this as long as its reasonably fast.
llvm-svn: 228131
with 'stress' to indicate that the specific output isn't interesting and
relax them to only check the last instruction (a ret).
I've updated the one test case that really uses this to name the one
'stress_test' which was actually producing output we can directly check.
With this, the script doesn't introduce noise when run over the v16 test
file.
llvm-svn: 228033
Similar to the C++14 void specializations of these templates, useful as
a stop-gap until LLVM switches to '14.
Example use-cases in tblgen because I saw some functors that looked like
they could be simplified/refactored.
Reviewers: dexonsmith
Differential Revision: http://reviews.llvm.org/D7324
llvm-svn: 227828
The hot path through this region of code does lots of batch inserts into sets. By storing them as sorted arrays, we can defer the sorting to the end of the batch, which is dramatically more efficient. This reduces tblgen runtime by 25% on my worst-case target.
llvm-svn: 227682
This is a continuation of my prior work to move some of the inner workings for CodeGenRegister to use bit vectors when computing about register units. This is highly beneficial to TableGen runtime on targets with large, dense register files. This patch represents a ~40% runtime reduction over and above my earlier improvement on a stress test of this case.
llvm-svn: 227678
For target descriptions with very large and very dense register files, TableGen
can take an extremely long time to run. This change makes a dent in that (~15%
in my measurements) by accelerating the single hottest operation with better data
structures.
I believe there's still a lot of room to make this even faster with more global
changes that require replacing some of the existing datastructures in this area
with bit vectors, but that's a more involved change and I wanted to get this
simpler improvement in first.
llvm-svn: 227562
derived classes.
Since global data alignment, layout, and mangling is often based on the
DataLayout, move it to the TargetMachine. This ensures that global
data is going to be layed out and mangled consistently if the subtarget
changes on a per function basis. Prior to this all targets(*) have
had subtarget dependent code moved out and onto the TargetMachine.
*One target hasn't been migrated as part of this change: R600. The
R600 port has, as a subtarget feature, the size of pointers and
this affects global data layout. I've currently hacked in a FIXME
to enable progress, but the port needs to be updated to either pass
the 64-bitness to the TargetMachine, or fix the DataLayout to
avoid subtarget dependent features.
llvm-svn: 227113
In llvm-mode, with electric-pair-mode turned on, typing a literal '['
would print out '[[', and '(' would print a '(('. This was a very
annoying bug caused by overzealous syntax-table entries: the parens are
already part of the '(' and ')' class by default. Fix this.
While at it, notice that i32, i64, i1 etc. are not font-locked despite a
clear intent to do so. The issue is that regexp-opt doesn't accept
regular expressions. So, spell out the common literal integers with
different widths.
Differential Revision: http://reviews.llvm.org/D7036
llvm-svn: 226931
Make it clear that the "llvm.org" style is deriving from "gnu" style,
and use the c-mode-common-hook instead of c-mode-hook and c++-mode-hook.
Differential Revision: http://reviews.llvm.org/D7035
llvm-svn: 226861
Specifically, gc.result benefits from this greatly. Instead of:
gc.result.int.*
gc.result.float.*
gc.result.ptr.*
...
We now have a gc.result.* that can specialize to literally any type.
Differential Revision: http://reviews.llvm.org/D7020
llvm-svn: 226857
ELF linkers by default allow shared libraries to contain undefined references
and it is up to the dynamic linker to look for them.
On COFF and MachO, that is not the case.
This creates a situation where a .so might build on an ELF system, but the build
of the corresponding .dylib or .dll will fail.
This patch changes the cmake build to use -Wl,-z,defs when linking and updates
the dependencies so that -DBUILD_SHARED_LIBS=ON build still works.
llvm-svn: 226611
This patch was generated by a clang tidy checker that is being open sourced.
The documentation of that checker is the following:
/// The emptiness of a container should be checked using the empty method
/// instead of the size method. It is not guaranteed that size is a
/// constant-time function, and it is generally more efficient and also shows
/// clearer intent to use empty. Furthermore some containers may implement the
/// empty method but not implement the size method. Using empty whenever
/// possible makes it easier to switch to another container in the future.
Patch by Gábor Horváth!
llvm-svn: 226161
This adds support for creating an InstAlias with a negative immediate, i.e.:
def NOT : InstAlias<"not $dst, $src", (XORI GR32:$dst, GR32:$src, -1)>;
by resolving this problem:
RISCVGenAsmMatcher.inc:95:11: error: expected '= constant-expression' or end of enumerator definition
CVT_imm_-1,
^^^^^^^^^^
Patch by Jordy Potman, thanks!
llvm-svn: 226073
This adds assembly and bitcode support for `MDLocation`. The assembly
side is rather big, since this is the first `MDNode` subclass (that
isn't `MDTuple`). Part of PR21433.
(If you're wondering where the mountains of testcase updates are, we
don't need them until I update `DILocation` and `DebugLoc` to actually
use this class.)
llvm-svn: 225830
These intrinsics allow multiple functions to share a single stack
allocation from one function's call frame. The function with the
allocation may only perform one allocation, and it must be in the entry
block.
Functions accessing the allocation call llvm.recoverframeallocation with
the function whose frame they are accessing and a frame pointer from an
active call frame of that function.
These intrinsics are very difficult to inline correctly, so the
intention is that they be introduced rarely, or at least very late
during EH preparation.
Reviewers: echristo, andrew.w.kaylor
Differential Revision: http://reviews.llvm.org/D6493
llvm-svn: 225746
Instead, just present the command for committing it. This way,
the user can test the merge locally, resolve conflicts, etc.
before committing, which seems much safer to me.
llvm-svn: 225737
Summary: I think this is probably a bug, but I'm putting this up for review just to be sure. I think that `lit.util.capture` should decode the resulting string in the same way `lit.util.executeCommand` does.
Reviewers: ddunbar, EricWF
Reviewed By: EricWF
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D6769
llvm-svn: 225681
This adds two new fields to the RegisterOperand TableGen class:
string OperandNamespace = "MCOI";
string OperandType = "OPERAND_REGISTER";
These fields can be used to specify a target specific operand type,
which will be stored in the OperandType member of the MCOperandInfo
object.
This can be useful for targets that need to store some extra information
about operands that cannot be expressed using the target independent
types. For example, in the R600 backend, there are operands which
can take either registers or immediates and it is convenient to be able
to specify this in the TableGen definitions.
llvm-svn: 225661
This script is currently specific to x86 and limited to use with very
small regression or feature tests using 'llc' and 'FileCheck' in
a reasonably canonical way. It is in no way general purpose or robust at
this point. However, it works quite well for simple examples. Here is
the intended workflow:
- Make a change that requires updating N test files and M functions'
assertions within those files.
- Stash the change.
- Update those N test files' RUN-lines to look "canonical"[1].
- Refresh the FileCheck lines for either the entire file or select
functions by running this script.
- The script will parse the RUN lines and run the 'llc' binary you
give it according to each line, collecting the asm.
- It will then annotate each function with the appropriate FileCheck
comments to check every instruction from the start of the first
basic block to the last return.
- There will be numerous cases where the script either fails to remove
the old lines, or inserts checks which need to be manually editted,
but the manual edits tend to be deletions or replacements of
registers with FileCheck variables which are fast manual edits.
- A common pattern is to have the script insert complete checking of
every instruction, and then edit it down to only check the relevant
ones.
- Be careful to do all of these cleanups though! The script is
designed to make transferring and formatting the asm output of llc
into a test case fast, it is *not* designed to be authoratitive
about what constitutes a good test!
- Commit the nice fresh baseline of checks.
- Unstash your change and rebuild llc.
- Re-run script to regenerate the FileCheck annotations
- Remember to re-cleanup these annotations!!!
- Check the diff to make sure this is sane, checking the things you
expected it to, and check that the newly updated tests actually pass.
- Profit!
Also, I'm *terrible* at writing Python, and frankly I didn't spend a lot
of time making this script beautiful or well engineered. But it's useful
to me and may be useful to others so I thought I'd send it out.
http://reviews.llvm.org/D5546
llvm-svn: 225618
Propagate whether `MDNode`s are 'distinct' through the other types of IR
(assembly and bitcode). This adds the `distinct` keyword to assembly.
Currently, no one actually calls `MDNode::getDistinct()`, so these nodes
only get created for:
- self-references, which are never uniqued, and
- nodes whose operands are replaced that hit a uniquing collision.
The concept of distinct nodes is still not quite first-class, since
distinct-ness doesn't yet survive across `MapMetadata()`.
Part of PR22111.
llvm-svn: 225474
* Both files have valid package headers and footers (you can verify
with M-x checkdoc).
* Fixed style warnings generated by checkdoc.
* Fixed a byte-compiler warning in llvm-mode.el.
* Ensure that the modes are autoloaded, so users do not need to
(require 'llvm-mode) to use them.
Patch by Wilfred Hughes.
llvm-svn: 225356
Requires new AsmParserOperand types that detect 16-bit and 32/64-bit mode so that we choose the right instruction based on default sizing without predicates. This is necessary since predicates mess up the disassembler table building.
llvm-svn: 225256
This is necessary to allow the disassembler to be able to handle AdSize32 instructions in 64-bit mode when address size prefix is used.
Eventually we should probably also support 'addr32' and 'addr16' in the assembler to override the address size on some of these instructions. But for now we'll just use special operand types that will lookup the current mode size to select the right instruction.
llvm-svn: 225075
This removes a hardcoded list of instructions in the CodeEmitter. Eventually I intend to remove the predicates on the affected instructions since in any given mode two of them are valid if we supported addr32/addr16 prefixes in the assembler.
llvm-svn: 224809