The original issue is caused by the fact that the variable is allocated
with incorrect type i1 instead of i8. This causes the bitcasting of the
declaration to i8 type and the bitcast expression does not match the
original variable.
To fix the problem, the UndefValue initializer and the original
variable should be emitted with type i8, not i1.
Differential Revision: https://reviews.llvm.org/D99297
On z/OS there is a hard limitation on on the maximum requestable alignment in aligned attribute for static variables. We need to truncate values greater than that.
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D98864
This is a 2nd try of:
3c8473ba53
which was reverted at:
a26312f9d4
because of crashing.
This version includes extra code and tests to avoid the known
crashing examples as discussed in PR49730.
Original commit message:
As noted in D98152, we need to patch SLP to avoid regressions when
we start canonicalizing to integer min/max intrinsics.
Most of the real work to make this possible was in:
7202f47508
Differential Revision: https://reviews.llvm.org/D98981
Need to insert a basic block during generation of the target region to
avoid crash for the GPU to be able always calling a cleanup action.
This cleanup action is required for the correct emission of the target
region for the GPU.
Differential Revision: https://reviews.llvm.org/D99445
Specifically, use these metafunctions consistently in areas that are
about to be affected by P1518R2's changes.
This is the NFCI part of https://reviews.llvm.org/D97742 .
The functional-change part is still waiting for P1518R2 to be
officially merged into the working draft.
This removes the need for the remaining doesNotMeet check and instead
directly checks if there are too many runtime checks for vectorization
in the planner.
A subsequent patch will adjust the logic used to decide whether to
vectorize with runtime to consider their cost more accurately.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D98634
Using $ breaks demangling of the symbols. For example,
$ c++filt _Z3foov\$123
_Z3foov$123
This causes problems for developers who would like to see nice stack traces
etc., but also for automatic crash tracking systems which try to organize
crashes based on the stack traces.
Instead, use the period as suffix separator, since Itanium demanglers normally
ignore such suffixes:
$ c++filt _Z3foov.123
foo() [clone .123]
This is already done in some places; try to do it everywhere.
Differential revision: https://reviews.llvm.org/D97484
This is an LLDB test for the ASTImporter crash that got fixed in D99077.
The test is using Clang modules for the properties as it seems the conflicting
names are not actually correctly handled when generating debug information
(only the first property is emitted and the second one is ignored in the current
clang ToT).
The following operations have no associated cost for them
when applied to scalable vectors, and as a consequence
can trigger a crash when a call is made to
AArch64TTIImpl::getCastInstrCost():
- fptrunc
- trunc
- fpext
- fpto(u,s)i
This patch adds costs for these operations and
relevant regression tests.
Differential Revision: https://reviews.llvm.org/D98934
Commit 6bc1e69de2 changed the search string
to also check for the generated strings that surround the plain assert:
Assertion `false && "lldb-test assert"' failed
^^^^^^^^^
This causes the test to fail on setups where the generated assert message
looks different. E.g., on macOS the generated message looks like this:
Assertion failed: (false && "lldb_assert failed"), function lldb_assert
This reverts the old behaviour of just checking for the actual string we
have inside LLDB.
Add a new clone operation to the memref dialect. This operation implicitly
copies data from a source buffer to a new buffer. In contrast to the linalg.copy
operation, this operation does not accept a target buffer as an argument.
Instead, this operation performs a conceptual allocation which does not need to
be performed manually.
Furthermore, this operation resolves the dependency from the linalg-dialect
in the BufferDeallocation pass. In addition, we also extended the canonicalization
patterns to fold clone operations. The copy removal pass has been removed.
Differential Revision: https://reviews.llvm.org/D99172
This extends the recent MVE lane interleaving passto handle other
non-instruction leaves, for which a new shuffle is added. This helps
especially for constants and potentially for arguments.
Differential Revision: https://reviews.llvm.org/D97289
LLDB on Linux built with symbols is showing this error.
Without symbols it still PASSes:
lldb-test: .../lldb/source/Utility/LLDBAssert.cpp:29: void lldb_private::lldb_assert(bool, const char *, const char *, const char *, unsigned int): Assertion `false && "lldb_assert failed"' failed.
With symbols it FAILs:
lldb-test: .../lldb/tools/lldb-test/lldb-test.cpp:1086: int opts::assert::lldb_assert(lldb_private::Debugger &): Assertion `false && "lldb-test assert"' failed.
Differential Revision: https://reviews.llvm.org/D99462
LLVMOrcDisposeObjectLayer and LLVMOrcExecutionSessionGetJITDylibByName did not
have matching signatures between the C-API header and binding implementations.
Fixes http://llvm.org/PR49745.
Patch by Mats Larsen. Thanks Mats!
Reviewed by: lhames
Differential Revision: https://reviews.llvm.org/D99478
This follows GCC and simplifies code. /usr/local/include and TOOL_INCLUDE_DIR
should not conflict with the resource directory include so users should not
observe any difference.
This can only happen if offset types that are larger than the
pointer size are involved. The previous implementation did not
assert in this case because it initialized the APInts to the
width of one of the variables -- though I strongly suspect it
did not compute correct results in this case.
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32621
reported by fhahn.
For these cases we need to extract the upper or lower elements,
multiply them using 16-bit multiplies and repack them.
Previously we used punpcklbw/punpckhbw+psraw or pmovsxbw+pshudfd to
extract and sign extend so we could use pmullw to compute the 16-bit
product and then shift down the high bits.
We can avoid the need to sign extend if we unpack the bytes into
the high byte of each word and fill the lower byte with 0 using
pxor. This puts the sign bit of each byte into the sign bit of
each word. Since the LHS and RHS have 8 trailing zeros, the full
32-bit product of those 16-bit values will have 16 trailing zeros.
This means the 16-bit product of the original bytes is in the upper
16 bits which we can calculate using pmulhw.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D98587
While working on D97208 I noticed that these greedy regular
expressions prevent tests from failing when (%rip) appears after
a constant pool label when it didn't before.
Reviewed By: RKSimon, pengfei
Differential Revision: https://reviews.llvm.org/D99460
MVE does not have a single sext/zext or trunc instruction that takes the
bottom half of a vector and extends to a full width, like NEON has with
MOVL. Instead it is expected that this happens through top/bottom
instructions. So the MVE equivalent VMOVLT/B instructions take either
the even or odd elements of the input and extend them to the larger
type, producing a vector with half the number of elements each of double
the bitwidth. As there is no simple instruction for a normal extend, we
often have to expand sext/zext/trunc into a series of lane moves (or
stack loads/stores, which we do not do yet).
This pass takes vector code that starts at truncs, looks for
interconnected blobs of operations that end with sext/zext and
transforms them by adding shuffles so that the lanes are interleaved and
the MVE VMOVL/VMOVN instructions can be used. This is done pre-ISel so
that it can work across basic blocks.
This initial version of the pass just handles a limited set of
instructions, not handling constants or splats or FP, which can all come
as extensions to this base.
Differential Revision: https://reviews.llvm.org/D95804
This follows GCC. Having libstdc++/libc++ include paths is not useful
anyway because libstdc++/libc++ header files cannot find features.h.
While here, suppress -stdlib++-isystem with -nostdlibinc.
RVV intrinsics has new overloading rule, please see
82aac7dad4
Changed:
1. Rename `generic` to `overloaded` because the new rule is not using C11 generic.
2. Change HasGeneric to HasNoMaskedOverloaded because all masked operations
support overloading api.
3. Add more overloaded tests due to overloading rule changed.
Differential Revision: https://reviews.llvm.org/D99189