This has two advantages.
1. It is more efficient. No need to clone the entire region.
2. Recreating ops (via cloning) invalidates analysis results. Previously, an OpResult could have bufferized out-of-place, even though the analysis requested an in-place bufferization. That is because BufferizationState keeps track of OpResults for storing bufferization analysis results (and cloned ops have new OpResults).
Differential Revision: https://reviews.llvm.org/D116453
D:\git\llvm-project\clang\unittests\Analysis\FlowSensitive\MultiVarConstantPropagationTest.cpp(104) : warning C4715: 'clang::dataflow::`anonymous namespace'::operator<<': not all control paths return a value
In addition, all functions that call `allocationFn` now return FailureOr<Value>. This resolves a few TODOs in the code base.
Differential Revision: https://reviews.llvm.org/D116452
Update prettyprinters.py to match MLIR changes.
This has gone unnoticed because no build bot is running tests with debug info.
I will look into what we can do about this separately. There is
https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/,
from Apple. The Debug Info tests are failing despite the green result.
See https://github.com/llvm/llvm-project/issues/48872.
Note: the llvm-support.gdb test only works with Debug,
but not RelWithDebInfo because some checked symbols are stripped.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D116646
If `createDealloc` is deactivated (enabled by default), newly allocated buffers are not deallocated anymore. In such a case, the missing deallocations can be inserted by the existing "BufferDeallocation" pass.
This change is needed for unifying core bufferization and Comprehensive Bufferize. Core bufferization has a separate pass for generating deallocations.
Note: In the future, this will evolve towards generating deallocation ops only for buffer allocations that do not escape block boundaries (i.e., that are in destination passing style).
Differential Revision: https://reviews.llvm.org/D116450
There was a limitation in legality that in the original inner loop latch,
no instruction was allowed between the induction variable increment
and the branch instruction. This is because we used to split the
inner latch at the induction variable increment instruction. Since
now we have split at the inner latch branch instruction and have
properly duplicated instructions over to the split block, we remove
this limitation.
Please refer to the test case updates to see how we now interchange
loops where instructions exist between the induction variable
increment and the branch instruction.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D115238
Include the complete list of threads of all running processes
in the FreeBSDKernel plugin. This makes it possible to inspect
the states (including partial register dumps from PCB) of all kernel
and userspace threads at the time of crash, or at the time of reading
/dev/mem first.
Differential Revision: https://reviews.llvm.org/D116255
One of our internal arm64 apps hit a thunk out of range error when building
with LLD. Per the comment, I'm arbitrarily increasing slop size to 256.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D116705
The old function names (e.g., `replaceOp`) could have been confusing to users because they sound similar to rewriter functions, but have slightly different semantics.
Differential Revision: https://reviews.llvm.org/D116449
Currently, it's inconsistent that warnings are disabled if they
come from system headers, unless they come from macros.
Typically a user cannot act upon these warnings coming from
system macros, so clang-tidy should ignore them unless the
user specifically requests warnings from system headers
via the corresponding configuration.
This change broke the ProTypeVarargCheck check, because it
was checking for the usage of va_arg indirectly, expanding it
(it's a system macro) to detect the usage of __builtin_va_arg.
The check has been fixed by checking directly what the rule
is about: "do not use va_arg", by adding a PP callback that
checks if any macro with name "va_arg" is expanded. The old
AST matcher is still kept for compatibility with Windows.
Add unit test that ensures warnings from macros are disabled
when not using the -system-headers flag. Document the change
in the Release Notes.
Differential Revision: https://reviews.llvm.org/D116378
In the test files, replace the old-style tests with a simple static_assert,
matching the current style as depicted in e.g.
`ranges_uninitialized_default_construct.pass.cpp`.
Preserve `is_function_like` (but renamed to `is_niebloid`) at
ldionne's request. The removal of this test helper will happen
in D116570 if at all.
Differential Revision: https://reviews.llvm.org/D116384
No need to include the order of the scalars beeing used as part of the
alternate vectorization into account when trying to reorder the whole
graph. Such elements better to reorder in the following phase because
the subtree still ends up in shuffle.
Part of D116688, fixes the regression in D116690.
Differential Revision: https://reviews.llvm.org/D116740
There is a bug in the reordering analysis stage. If the element with the
given hash is not added to the map but has the same number of APOs and
instructions with same parent, but different instruction opcode, it will
be initalized with default values and then the counter is increased by
1. But the lane is not updated and default to 0 instead of the actual
`Lane` value. It leads to the fact that the analysis is useless in
many cases and default to lane 0 instead of actual lane with the
minimum amount of APO operands.
Differential Revision: https://reviews.llvm.org/D116690
`tensor.collapse_shape` op when fused with a consumer elementwise
`linalg.generic` operation results in creation of tensor.expand_shape
ops. In purely dynamic cases this can end up with a dynamic dimensions
being expanded to more than one dynamic dimension. This is disallowed
by the semantics of `tensor.expand_shape` operation. (While the
transformation is itself correct, its a gap in the specification of
`tensor.expand_shape` that is the issue). So disallow fusions which
result in such a pattern.
Differential Revision: https://reviews.llvm.org/D116703
This adds some AArch64 specific smul_with_overflow and umul_with_overflow
costs, overriding the default costs. The code generation for these mul
with overflow intrinsics is usually better than the default expansion on
AArch64. The costs come from https://godbolt.org/z/zEzYhMWqo with various
types, or llvm/test/CodeGen/AArch64/arm64-xaluo.ll.
Differential Revision: https://reviews.llvm.org/D116732
This should have been done in 6a6a80e88e, but buildkite was down so I
hadn't noticed. This brings this test file into line with several others
in this directory.
dyn_cast<> can return null - use cast<> instead to assert the cast is valid before dereferencing the casted pointer.
Fixes static-analyzer null dereference warning.
I am suspecting a bug around updates of loop info for unreachable exits, but don't have a test case. Running this locally on make check didn't reveal anything, we'll see if the expensive checks bots find it.
This patch updates the `flang` bash scripts to differentiate between
object files provided by the user and intermediate object files
generated by the script. The latter are an "implementation detail" that
should not be visible to the end user (i.e. deleted before the scripts
exits). The former should be preserved.
Fixes https://github.com/flang-compiler/f18-llvm-project/issues/1348
Differential Revision: https://reviews.llvm.org/D116590
The 0 immediate can't be selected to vmsgtu.vi/vmsleu.vi by decrementing
the immediate. To prevent his we had special patterns that provided
alternate lowering for the 0 cases. This relied on tablegen prioritizing
the 0 pattern over the sim5_plus1 range.
This patch introduces simm5_plus1_nonzero that excludes 0. It also
excludes the special case for vmsltu.vi since we can just use
vmsltu.vx and let the 0 be selected to X0.
This is an alternative to some of the changes in D116584.
Reviewed By: Chenbing.Zheng, asb
Differential Revision: https://reviews.llvm.org/D116723
Include the value of `ZLIB_ROOT` in `LLVMConfig.cmake` so `FindZLIB` can pick it up. This fixes an issue where ZLIB is not found on AIX runtimes despite specifying `-DZLIB_ROOT`.
Reviewed By: daltenty
Differential Revision: https://reviews.llvm.org/D116235
Function calls and compare instructions tend to cause sext.w
instructions to be inserted. If we make good use of W instructions,
these operations can often end up being redundant. We don't always
detect these during SelectionDAG due to things like phis. There also
some cases caused by failure to turn extload into sextload in
SelectionDAG. extload selects to LW allowing later sext.ws to become
redundant.
This patch adds a pass that examines the input of sext.w instructions trying
to determine if it is already sign extended. Either by finding a
W instruction, other instructions that produce a sign extended result,
or looking through instructions that propagate sign bits. It uses
a worklist and visited set to search as far back as necessary.
Reviewed By: asb, kito-cheng
Differential Revision: https://reviews.llvm.org/D116397
The zextload hook is only used to determine whether to insert a
zero_extend or any_extend for narrow types leaving a basic block.
Returning true from this hook tends to cause any load whose output
leaves the basic block to become an LWU instead of an LW.
Since we tend to prefer sexts for i32 compares on RV64, this can
cause extra sext.w instructions to be created in other basic blocks.
If we use LW instead of LWU this gives the MIR pass from D116397
a better chance of removing them.
Another option might be to teach getPreferredExtendForValue in
FunctionLoweringInfo.cpp about our preference for sign_extend of
i32 compares. That would cause SIGN_EXTEND to be chosen for any
value used by a compare instead of using the isZExtFree heuristic.
That will require code to convert from the llvm::Type* to EVT/MVT
as well as querying the type legalization actions to get the
promoted type in order to call TargetLowering::isSExtCheaperThanZExt.
That seemed like many extra steps when no other target wants it.
Though it would avoid us needing to lean on the MIR pass in some cases.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D116567
This change simplifies BufferizableOpInterface and other functions. Overall, the API will get smaller: Functions related to custom IR traversal are deleted entirely. This will makes it easier to write BufferizableOpInterface implementations.
This is also in preparation of unifying Comprehensive Bufferize and core bufferization. While Comprehensive Bufferize could theoretically maintain its own IR traversal, there is no reason to do so, because all bufferize implementations in BufferizableOpInterface have to support partial bufferization anyway. And we can share a larger part of the code base between the two bufferizations.
Differential Revision: https://reviews.llvm.org/D116448
This is mostly for documentation purposes: Passing the object as a const reference signifies that analysis decisions cannot be changed after the analysis.
Differential Revision: https://reviews.llvm.org/D116742