We want to deal with non-default constructors that just happen to
contain constant initializers. There was already a negative test case,
it is now a positive one. We find and refactor this case:
struct PositiveNotDefaultInt {
PositiveNotDefaultInt(int) : i(7) {}
int i;
};
Address the advice proposed at patch D105429 . Use [Low, Low+size) to represent bits.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D107250
This is a preparation for another change in the watchOS/tvOS availability logic. It is extracted into a separate commit to simplify reviewing and to keep the linter happy at the same time.
rdar://81491680
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D116459
Skip operation on the lower byte in int16 logical left shift when
shift amount is greater than 8.
Skip operation on the higher byte in int16 logical & arithmetic
right shift when shift amount is greater than 8.
Reviewed By: aykevl
Differential Revision: https://reviews.llvm.org/D115594
This patch extends `emitUnaryBuiltin` so that we can better emitting IR when
implement builtins specified in D111529.
Also contains some NFC, applying it to existing code.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D116161
This fold is not correct, because indices might evaluate to zero
even if they are not a literal zero integer. Additionally, this
fold would be wrong (in the general case) for non-i8 types as well,
due to index overflow.
Drop this fold and instead let the target-dependent constant
folder compute the actual offset and fold the comparison based
on that.
This fold is incorrect, because it assumes that all indices are
non-zero. This happens to be true for the test as written, but
doesn't hold if we use an extern weak global instead, for which
ptrtoint might be zero.
Add separate tests for the simple constant int case.
Along with the new language mode this commit contains misc
small updates for OpenCL 3 and GitHub issues for OpenCL.
Differential Revision: https://reviews.llvm.org/D116271
Previously, it was in canSafelySkipNode, which is only used to decide
whether we should descend into it and its children, and we still used
the incomplete Decltypeloc.getSourceRange() to claim tokens, which will
cause some tokens were not claimed correctly.
Separate a change of https://reviews.llvm.org/D116536
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D116586
This involves separating out the concepts of "which tokens should we
descend into this node for" vs "which tokens should this node claim".
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D116218
For loops that contain in-loop reductions but no loads or stores, large
VFs are chosen because LoopVectorizationCostModel::getSmallestAndWidestTypes
has no element types to check through and so returns the default widths
(-1U for the smallest and 8 for the widest). This results in the widest
VF being chosen for the following example,
float s = 0;
for (int i = 0; i < N; ++i)
s += (float) i*i;
which, for more computationally intensive loops, leads to large loop
sizes when the operations end up being scalarized.
In this patch, for the case where ElementTypesInLoop is empty, the widest
type is determined by finding the smallest type used by recurrences in
the loop instead of falling back to a default value of 8 bits. This
results in the cost model choosing a more sensible VF for loops like
the one above.
Differential Revision: https://reviews.llvm.org/D113973
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed-by: xazax.hun
Differential Revision: https://reviews.llvm.org/D116368
Indirect inline asm operands may require the materialization of a
memory access according to the pointer element type. As this will
no longer be available with opaque pointers, we require it to be
explicitly annotated using the elementtype attribute, for example:
define void @test(i32* %p, i32 %x) {
call void asm "addl $1, $0", "=*rm,r"(i32* elementtype(i32) %p, i32 %x)
ret void
}
This patch only includes the LangRef change and Verifier updates to
allow adding the elementtype attribute in this position. It does not
yet enforce this, as this will require changes on the clang side
(and test updates) first.
Something I'm a bit unsure about is whether we really need the
elementtype for all indirect constraints, rather than only indirect
register constraints. I think indirect memory constraints might not
strictly need it (though the backend code is written in a way that
does require it). I think it's okay to just make this a general
requirement though, as this means we don't need to carefully deal
with multiple or alternative constraints. In addition, I believe
that MemorySanitizer benefits from having the element type even in
cases where it may not be strictly necessary for normal lowering
(cd2b050fa4/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp (L4066)).
Differential Revision: https://reviews.llvm.org/D116531
Currently, the code in TargetLoweringObjectFile only assigns
@init_array section type to plain .init_array sections, but not
prioritized sections like .init_array.00001.
This is inconsistent with the interpretation in the AsmParser
(see 791523bae6/llvm/lib/MC/MCParser/ELFAsmParser.cpp (L621-L632))
and upcoming expectations in LLD
(see https://github.com/rust-lang/rust/issues/92181 for context).
This patch assigns @init_array section type to all sections with an
.init_array prefix. The same is done for .fini_array and
.preinit_array as well. With that, the logic matches the AsmParser.
Differential Revision: https://reviews.llvm.org/D116528
Global ctor evaluation currently models memory as a map from Constant*
to Constant*. For this to be correct, it is required that there is
only a single Constant* referencing a given memory location. The
Evaluator tries to ensure this by imposing certain limitations that
could result in ambiguities (by limiting types, casts and GEP formats),
but ultimately still fails, as can be seen in PR51879. The approach
is fundamentally fragile and will get more so with opaque pointers.
My original thought was to instead store memory for each global as an
offset => value representation. However, we also need to make sure
that we can actually rematerialize the modified global initializer
into a Constant in the end, which may not be possible if we allow
arbitrary writes.
What this patch does instead is to represent globals as a MutableValue,
which is either a Constant* or a MutableAggregate*. The mutable
aggregate exists to allow efficient mutation of individual aggregate
elements, as mutating an element on a Constant would require interning
a new constant. When a write to the Constant* is made, it is converted
into a MutableAggregate* as needed.
I believe this should make the evaluator more robust, compatible
with opaque pointers, and a bit simpler as well.
Fixes https://github.com/llvm/llvm-project/issues/51221.
Differential Revision: https://reviews.llvm.org/D115530
Exporting a llvm.landingpad operation with the cleanup flag set is currently ignored by the export code.
Differential Revision: https://reviews.llvm.org/D116565
By default, the listeners do nothing unless linked in.
This revision allows the "Perf" and "Intel" Jit event listeners to be used.
The "OProfile" event listener is not enabled at this time,
the associated library structure is not well-isolated.
Differential Revision: https://reviews.llvm.org/D116552
The inlined llvm.coro.id should contain the function it refers to.
The modifed test would caused the compiler crash under O2. See
issue52912 for example.
This diff adds support to printing a Value when it is null. We encounter this situation when debugging the PDL bytcode execution (where a null Value is perfectly valid). Currently, the AsmPrinter crashes (with an assert in a cast) when it encounters such Value.
We follow the same format used in other printed entities (e.g., null attribute).
Reviewed By: mehdi_amini, bondhugula
Differential Revision: https://reviews.llvm.org/D116084
Presently the result type verification checks if the type is used by a `pdl::OperationOp` inside the matcher. This is unnecessarily restrictive; the type could come from a `pdl::OperandOp or `pdl::OperandsOp` and still be inferrable.
Reviewed By: rriddle, Mogball
Differential Revision: https://reviews.llvm.org/D116083
This diff adds an integration test to multi-root PDL matching. It consists of two subtests:
1) A 1-layer perceptron with split forward / backward operations.
2) A 2-layer perceptron with fused forward / backward operations.
These tests use a collection of hand-written patterns and TensorFlow operations to be matched. The first test has a DAG / SSA dominant resulting match; the second does not and is therefore stored in a graph region.
This diff also includes two bug fixes:
1) Mark the pdl_interp dialect as a dependent in the TestPDLByteCodePass. This is needed, because we create ops from that dialect as a part of the PDL-to-PDLInterp lowering.
2) Fix of the starting index in the liveness range for the ForEach operations (bug exposed by the integration test).
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D116082
The tree merging of pattern predicates places the predicates in an unordered set. When the predicates are sorted, they are taken in the set order, not the insertion order. This results in nondeterministic behavior.
One solution to this problem would be to use `SetVector`. However, the value `SetVector` does not provide a `find` function for fast O(1) lookups and stores the predicates twice -- once in the set and once in the vector, which is undesirable, because we store patternToAnswer in each predicate. A simpler solution is to store the tie breaking ID (which follows the insertion order), and use this ID to break any ties when comparing predicates.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D116081
When the original version of multi-root patterns was reviewed, several improvements were made to the pdl_interp operations during the review process. Specifically, the "get users of a value at the specified operand index" was split up into "get users" and "compare the users' operands with that value". The iterative execution was also cleaned up to `pdl_interp.foreach`. However, the positions in the pdl-to-pdl_interp lowering were not similarly refactored. This introduced several problems, including hard-to-detect bugs in the lowering and duplicate evaluation of `pdl_interp.get_users`.
This diff cleans up the positions. The "upward" `OperationPosition` was split-out into `UsersPosition` and `ForEachPosition`, and the operand comparison was replaced with a simple predicate. In the process, I fixed three bugs:
1. When multiple roots were had the same connector (i.e., a node that they shared with a subtree at the previously visited root), we would generate a single foreach loop rather than one foreach loop for each such root. The reason for this is that such connectors shared the position. The solution for this is to add root index as an id to the newly introduced `ForEachPosition`.
2. Previously, we would use `pdl_interp.get_operands` indiscriminately, whether or not the operand was variadic. We now correctly detect variadic operands and insert `pdl_interp.get_operand` when needed.
3. In certain corner cases, we would trigger the "connector has not been traversed yet" assertion. This was caused by not inserting the values during the upward traversal correctly. This has now been fixed.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D116080
The result value of a llvm.invoke operation is currently not mapped to the corresponding llvm::Value* when exporting to LLVM IR. This leads to any later operations using the result to crash as it receives a nullptr.
Differential Revision: https://reviews.llvm.org/D116564