This patch adds an initial ShuffleVectorInst::isInsertSubvectorMask helper to recognize 2-op shuffles where the lowest elements of one of the sources are being inserted into the "in-place" other operand, this includes "concat_vectors" patterns as can be seen in the Arm shuffle cost changes. This also helped fix a x86 issue with irregular/length-changing SK_InsertSubvector costs - I'm hoping this will help with D107188
This doesn't currently attempt to work with 1-op shuffles that could either be a "widening" shuffle or a self-insertion.
The self-insertion case is tricky, but we currently always match this with the existing SK_PermuteSingleSrc logic.
The widening case will be addressed in a follow up patch that treats the cost as 0.
Masks with a high number of undef elts will still struggle to match optimal subvector widths - its currently bounded by minimum-width possible insertion, whilst some cases would benefit from wider (pow2?) subvectors.
Differential Revision: https://reviews.llvm.org/D107228
When the limit of the inner loop is a known integer, the InstCombine
pass now causes the transformation e.g. imcp ult i32 %inc, tripcount ->
icmp ult %j, tripcount-step (where %j is the inner loop induction
variable and %inc is add %j, step), which is now accounted for when
identifying the trip count of the loop. This is also an acceptable use
of %j (provided the step is 1) so is ignored as long as the compare
that it's used in is also the condition of the inner branch.
Differential Revision: https://reviews.llvm.org/D105802
If the block target for a WLSTP instruction is known to be out of range,
and cannot be fixed by the ARMBlockPlacementPass, we can relax it to a
DLSTP (and cmp/branch) to still allow the creation of tail predicated
loops. That is what this patch does, adding extra revert code to the
fallback path of ARMBlockPlacementPass.
Due to the code produced when reverting, this creates a DLSTP between a
Bcc and a Br. As a DLS isn't necessarily a terminator we need to split
the block to move the DLS/Br into.
Differential Revision: https://reviews.llvm.org/D104709
The encoding used for opening files depends on the OS and might be different
from UTF-8 (e.g. on Windows it can be CP-1252). The documentation files use
UTF-8 and might be incompatible with other encodings. For example, right now
`clang-tools-extra/docs/clang-tidy/checks/abseil-no-internal-dependencies.rst`
has non-ASCII quotes and running `add_new_check.py` fails on Windows, because
it tries to read the file with incompatible encoding.
Use `io.open` for compatibility with both Python 2 and Python 3.
Reviewed By: kbobyrev
Differential Revision: https://reviews.llvm.org/D106792
ProcessPendingSignals is called in all interceptors
and user atomic operations. Make the fast-path check
(no pending signals) inlinable.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D107217
We might want to use info from GC strategy in middle end analysis.
The motivation for this is provided in D99135: we may want to ask
a GC if it's going to work with a given pointer (currently this code
makes naive check by the method name).
Differetial Revision: https://reviews.llvm.org/D100559
Reviewed By: reames
Happens when DestContext is LinkageSpecDecl and hense CurContext happens to be
both not TagDecl and NamespaceDecl.
Minimal reproducer: trigger define outline in
```
namespace ns {
extern "C" {
typedef int foo;
}
foo Fo^o(int id) { return id; }
}
```
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D107047
When compiling on ZLinux, we got this error:
/llvm-project/llvm/examples/OrcV2Examples/LLJITWithRemoteDebugging/ \
RemoteJITUtils.h:80:65: required from here...
/usr/include/c++/7/bits/unique_ptr.h:76:22: error: invalid application of
'sizeof' to incomplete type 'llvm::orc::RemoteExecutorProcessControl'
static_assert(sizeof(_Tp)>0,
This patch just removes nullptr from the initialization of
std::unique_ptr<RemoteExecutorProcessControl> to avoid the issue.
Patch by Tung D. Le (tung@jp.ibm.com). Thanks Tung!
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D107247
Application of default mapping to BVH intrinsics was missing.
Copy parts of SelectionDAG test to GlobalISel test as these would
have indicated this error.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D107211
Test dependent on pexpect fail randomly with timeouts on Arm/AArch64 Linux
buildbots. I am setting pexpect timeout from 30 to 60.
I will revert this back if this doesnt improve random failures.
This reverts commit fd18f0e84c.
I reverted this change to see its effect on failing GUI tests on LLDB
Arm/AArch64 Linux buildbots. I could not find any evidence against this
particular change so reverting it back.
Differential Revision: https://reviews.llvm.org/D100243
Following tests have been failing randomly on LLDB Arm and AArch64 Linux
builtbots:
TestMultilineNavigation.py
TestMultilineCompletion.py
TestIOHandlerCompletion.py
TestGuiBasic.py
I have increased allocated CPU resources to these bots but it has not
improved situation to an acceptable level. This patch marks them as
skipped on Arm/AArch64 for now.
The -fms-extensions converts __pragma (and _Pragma) into a #pragma that
has to occur at the beginning of a line and end with a newline. This
patch ensures that the newline after the #pragma is added even if
Token::isAtStartOfLine() indicated that we should not start a newline.
Committing relying post-commit review since the change is small, some
downstream uses might be blocked without this fix, and to make clear the
decision of the new -fminimize-whitespace feature (fix on main, revert
on clang-13.x branch) suggested by @aaron.ballman in D104601.
Differential Revision: https://reviews.llvm.org/D107183
Use `clang_target_link_libraries` to avoid duplicate libraries when
the same symbol is provided both by a static library and a larger
dylib, fixing linking with win32 dylibs. This fixes errors like
these:
ld.lld: error: duplicate symbol: llvm::createStringError(std::__1::error_code, char const*)
>>> defined at libLLVMSupport.a(Error.cpp.obj)
>>> defined at libLLVM-14git.dll
This matches how other clang tools declare their dependencies.
Differential Revision: https://reviews.llvm.org/D107231
Change `ThreadPlanStack::PopPlan` and `::DiscardPlan` to not do the following:
1. Move the last plan, leaving a moved `ThreadPlanSP` in the plans vector
2. Operate on the last plan
3. Pop the last plan off the plans vector
This leaves a period of time where the last element in the plans vector has been moved. I am not sure what, if any, guarantees there are when doing this, but it seems like it would/could leave a null `ThreadPlanSP` in the container. There are asserts in place to prevent empty/null `ThreadPlanSP` instances from being pushed on to the stack, and so this could break that invariant during multithreaded access to the thread plan stack.
An open question is whether this use of `std::move` was the result of a measure performance problem.
Differential Revision: https://reviews.llvm.org/D106171
The `DataLayout` class currently contains the member `layoutStack` which is hidden behind a preprocessor region dependant on the NDEBUG macro. Code wise this makes a lot of sense, as the `layoutStack` is used for extra assertions that users will want when compiling a debug build.
It however has the uncomfortable consequence of leading to a different ABI in Debug and Release builds. This I think is a bit annoying for downstream projects and others as they may want to build against a stable Release of MLIR in Release mode, but be able to debug their own project depending on MLIR.
This patch changes the related uses of NDEBUG to LLVM_ENABLE_ABI_BREAKING_CHECKS. As the macro is computed at configure time of LLVM, it may not change based on compiler settings of a downstream projects like NDEBUG would.
Differential Revision: https://reviews.llvm.org/D107227
These are some of the basic cases taken from X86.
We currently fail to use lookup tables on many of these cases
because SimplifyCFG requires a legal type to do the transform and
RISCV only has one legal integer type.
I'm working on extending the OptimizeGlobalAddressOfMalloc to handle some more general cases. This is to add support of the ConstantExpr use of the global variables. The function allUsesOfLoadedValueWillTrapIfNull is now iterative with the added CE use of GV. Also, the recursive function valueIsOnlyUsedLocallyOrStoredToOneGlobal is changed to iterative using a worklist with the GEP case added.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D106589
Currently, the default alignment is much larger than the actual size of
the vector in memory. Fix this to use a sane default.
For SVE, temporarily remove lowering of load/store operations for
predicates with less than 16 elements. The layout the backend was
assuming for SVE predicates with less than 16 elements doesn't agree
with the frontend. More work probably needs to be done here.
This change is, strictly speaking, not backwards-compatible at the
bitcode level. But probably nobody is actually depending on that; i1
vectors in memory are rare, and the code that does use them probably
ends up forcing the alignment to something sane anyway. If we think
this is a concern, I can restrict this to scalable vectors for now
(where it's actually causing issues for me at the moment).
Differential Revision: https://reviews.llvm.org/D88994
Target-dependent constant folding will fold these down to simple
constants (or at least, expressions that don't involve a GEP). We don't
need heroics to try to optimize the form of the expression before that
happens.
Fixes https://bugs.llvm.org/show_bug.cgi?id=51232 .
Differential Revision: https://reviews.llvm.org/D107116
Introduces a conversion from one (sparse) tensor type to another
(sparse) tensor type. See the operation doc for details. Actual
codegen for all cases is still TBD.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D107205
The check for size_t parameter 1 was already here for snprintf_chk,
but it wasn't applied to regular snprintf. This could lead to
mismatching and eventually crashing as shown in:
https://llvm.org/PR50885
I don't know much about this pass, but we need a stronger
check on the memset length arg to avoid an assert. The
current code was added with D59000.
The test is reduced from:
https://llvm.org/PR50910
Differential Revision: https://reviews.llvm.org/D106462