When tracking defined lanes through phi nodes in the live range
graph each branch of the phi must be handled independently.
Also rewrite the marking algorithm to reduce unnecessary
operations.
Previously a shared set of defined lanes was used which caused
marking to stop prematurely. This was observable in existing lit
tests, but test patterns did not cover this detail.
Reviewed By: piotr
Differential Revision: https://reviews.llvm.org/D98614
The patch in question broke the build with shared libraries due to
missing dependencies, one of which would have been circular between
MLIRStandard and MLIRMemRef if added. Fix this by moving more code
around and swapping the dependency direction. MLIRMemRef now depends on
MLIRStandard, but MLIRStandard does _not_ depend on MLIRMemRef.
Arguably, this is the right direction anyway since numerous libraries
depend on MLIRStandard and don't necessarily need to depend on
MLIRMemref.
Other otable changes include:
- some EDSC code is moved inline to MemRef/EDSC/Intrinsics.h because it
creates MemRef dialect operations;
- a utility function related to shape moved to BuiltinTypes.h/cpp
because it only realtes to shaped types and not any particular dialect
(standard dialect is erroneously believed to contain MemRefType);
- a Python test for the standard dialect is disabled completely because
the ops it tests moved to the new MemRef dialect, but it is not
exposed to Python bindings, and the change for that is non-trivial.
The test optnone-simple-functions.cpp added in D97668 fails on macOS.
os.path.exists raises an exception because we pass it None. Guard against this.
Related revision: https://reviews.llvm.org/D97668
TestExitDuringExpression test_exit_before_one_thread_unwind fails
sporadically on arm/linux. This seems like a thread timing issue.
I am marking it skip for now.
On 64-bit systems with small VMAs (e.g. 39-bit) we can't use
`SizeClassAllocator64` parameterized with size class maps containing a
large number of classes, as that will make the allocator region size too
small (< 2^32). Several tests were already disabled for Android because
of this.
This patch provides the correct allocator configuration for RISC-V
(riscv64), generalizes the gating condition for tests that can't be
enabled for small VMA systems, and tweaks the tests that can be made
compatible with those systems to enable them.
Differential Revision: https://reviews.llvm.org/D97234
Generate a json file containing descriptions of AST classes and their
public accessors which return SourceLocation or SourceRange.
Use the JSON file to generate a C++ API and implementation for accessing
the source locations and method names for accessing them for a given AST
node.
This new API can be used to implement 'srcloc' output in clang-query:
http://ce.steveire.com/z/m_kTIo
The JSON file can also be used to generate bindings for other languages,
such as Python and Javascript:
https://steveire.wordpress.com/2019/04/30/the-future-of-ast-matching
In this first version of this feature, only the accessors for Stmt
classes are generated, not Decls, TypeLocs etc. Those can be added
after this change is reviewed, as this change is mostly about
infrastructure of these code generators.
Also in this version, the platforms/cmake configurations are excluded as
much as possible so that support can be added iteratively. Currently a
break on any platform causes a revert of the entire feature. This way,
the `OR WIN32` can be removed in a future commit and if it breaks the
buildbots, only that commit gets reverted, making the entire process
easier to manage.
Differential Revision: https://reviews.llvm.org/D93164
Normally, this function just doesn't bother about cycles,
and hopes that the caller supplied small-enough depth
so that at worst it will take a potentially large,
but limited amount of time. But that obviously doesn't work
if there is no depth limit.
This reapples 36f1c3db66,
but without asserting, just bailout once cycle is detected.
This patch adds fixed-length vector support to the calling convention
when RVV is used to lower fixed-length vectors. The scheme follows the
regular vector calling convention for the argument/return registers, but
uses scalable vector container types as the LocVTs, and converts to/from
the fixed-length vector value types as required.
Fixed-length vector types may be split when the combination of minimum
VLEN and the maximum allowable LMUL is not large enough to fully contain
the vector. In this case the behaviour differs between fixed-length
vectors passed as parameters and as return values:
1. For return values, vectors must be passed entirely via registers or
via the stack.
2. For parameters, unlike scalar values, split vectors continue to be
passed by value, and are split across multiple registers until there are
no remaining registers. Thus vector parameters may be found partly in
registers and partly on the stack.
As with scalable vectors, the first fixed-length mask vector is passed
via v0. Split mask fixed-length vectors are passed first via v0 and then
via the next available vector register: v8,v9,etc.
The handling of vector return values uses all available argument
registers v8-v23 which does not adhere to the calling convention we're
supposedly implementing, but since this issue affects both fixed-length
and scalable-vector values, it was left as-is.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D97954
The reason for this is to avoid deep recursion in DFS() which can cause
stack overflow on large CFGs, especially on Windows.
Differential Revision: https://reviews.llvm.org/D98528
Start the description from a new line instead of putting the first
paragraph in the section header. Wrap the class name in backticks to
make it clear that it relates to the code.
For slow-hop targets, see if any single-op hops are duplicating work already done on another (dual-op) hop, which can sometimes occur as isHorizontalBinOp tries to find potential duplicates (but can't merge them itself). If so, reuse the other hop and shuffle the result.
This reverts commit b5d9a3c923.
The commit introduced a memory error in canonicalization/operation
walking that is exposed when compiled with ASAN. It leads to crashes in
some "release" configurations.
-mbranch-protection protects the LR on the stack with PAC.
When the frames are walked the LR need to be cleared.
This inline assembly later will be replaced with a new builtin.
Test: build with -DCMAKE_C_FLAGS="-mbranch-protection=standard".
Reviewed By: kubamracek
Differential Revision: https://reviews.llvm.org/D98008
Jeroen Dobbelaere in
https://lists.llvm.org/pipermail/llvm-dev/2021-March/149206.html
is reporting that this function can end up in an endless loop
when called from SROA w/ full restrict patches.
For now, simply ensure that such problems are caught earlier/easier.
We know that all ConstantLike operations have one result and no operands,
so check this first before doing the trait check. This change speeds up
Canonicalize on a CIRCT testcase by ~5%.
Differential Revision: https://reviews.llvm.org/D98615
Types of fractional LMUL and LMUL=1 are all using VR register class. When
using inline asm, it will use the first type in the register class as the
type for the register. It is not necessary the same as the value type. We
need to use INSERT_SUBVECTOR/EXTRACT_SUBVECToR/BITCAST to make it legal
to put the value in the corresponding register class.
Differential Revision: https://reviews.llvm.org/D97480
Two changes:
1) Change the canonicalizer to walk the function in top-down order instead of
bottom-up order. This composes well with the "top down" nature of constant
folding and simplification, reducing iterations and re-evaluation of ops in
simple cases.
2) Explicitly enter existing constants into the OperationFolder table before
canonicalizing. Previously we would "constant fold" them and rematerialize
them, wastefully recreating a bunch fo constants, which lead to pointless
memory traffic.
Both changes together provide a 33% speedup for canonicalize on some mid-size
CIRCT examples.
One artifact of this change is that the constants generated in normal pattern
application get inserted at the top of the function as the patterns are applied.
Because of this, we get "inverted" constants more often, which is an aethetic
change to the IR but does permute some testcases.
Differential Revision: https://reviews.llvm.org/D98609
I encountered a project that uses llvm that passes "generic" by
default. While I could fix that project, I wouldn't be surprised
if other projects did something similar. So it seems like
a good idea to provide a better error here.
I've also added validation of the 64Bit feature against the
triple so that we can catch a mismatched CPU before failing in
a mysterious way. We can make it pretty far in isel because we
calculate XLenVT from the triple and use that to set up the legal
integer type.
Reviewed By: luismarques, khchen
Differential Revision: https://reviews.llvm.org/D98307
Generate a json file containing descriptions of AST classes and their
public accessors which return SourceLocation or SourceRange.
Use the JSON file to generate a C++ API and implementation for accessing
the source locations and method names for accessing them for a given AST
node.
This new API can be used to implement 'srcloc' output in clang-query:
http://ce.steveire.com/z/m_kTIo
The JSON file can also be used to generate bindings for other languages,
such as Python and Javascript:
https://steveire.wordpress.com/2019/04/30/the-future-of-ast-matching
In this first version of this feature, only the accessors for Stmt
classes are generated, not Decls, TypeLocs etc. Those can be added
after this change is reviewed, as this change is mostly about
infrastructure of these code generators.
Also in this version, the platforms/cmake configurations are excluded as
much as possible so that support can be added iteratively. Currently a
break on any platform causes a revert of the entire feature. This way,
the `OR WIN32` can be removed in a future commit and if it breaks the
buildbots, only that commit gets reverted, making the entire process
easier to manage.
Differential Revision: https://reviews.llvm.org/D93164
Generate a json file containing descriptions of AST classes and their
public accessors which return SourceLocation or SourceRange.
Use the JSON file to generate a C++ API and implementation for accessing
the source locations and method names for accessing them for a given AST
node.
This new API can be used to implement 'srcloc' output in clang-query:
http://ce.steveire.com/z/m_kTIo
The JSON file can also be used to generate bindings for other languages,
such as Python and Javascript:
https://steveire.wordpress.com/2019/04/30/the-future-of-ast-matching
In this first version of this feature, only the accessors for Stmt
classes are generated, not Decls, TypeLocs etc. Those can be added
after this change is reviewed, as this change is mostly about
infrastructure of these code generators.
Also in this version, the platforms/cmake configurations are excluded as
much as possible so that support can be added iteratively. Currently a
break on any platform causes a revert of the entire feature. This way,
the `OR WIN32` can be removed in a future commit and if it breaks the
buildbots, only that commit gets reverted, making the entire process
easier to manage.
Differential Revision: https://reviews.llvm.org/D93164
The functionality is not posix specific. Also force the usage of the
gdb-remote process plugin in the gdb platform class.
This is not sufficient to make TestPlatformConnect pass on windows (it
seems it suffers from module loading issues, unrelated to this test),
but it at least makes it shut down correctly, so I change the skip to an
xfail.
D98289 was erroneously reporting `invalid range list offset 0x20110`
instead of `invalid range list table index 0`.
Differential Revision: https://reviews.llvm.org/D98589