If the shift amount has been zero-extended, peek through as this might help us further canonicalize the shift amount.
Fixes regression mentioned in rG147cfcbef1255ba2b4875b76708dab1a685085f5
Pass LIBCXX_HAS_PTHREAD_LIB, LIBCXX_HAS_RT_LIB and LIBCXXABI_HAS_PTHREAD_LIB
through to the custom lib++ builds so that libfuzzer doesn't end up with a .deplibs section that
links against those libraries when the variables are set to false.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D120946
With modern architectures having a thread pointer and language supporting
thread locals, there is no reason to use a function intermediary to access
the thread local errno value.
The entrypoint corresponding to errno has been replaced with an object
library as there is no formal entrypoint for errno anymore.
Reviewed By: jeffbailey, michaelrj
Differential Revision: https://reviews.llvm.org/D120920
Adds `MatchSwitch`, a library for simplifying implementation of transfer
functions. `MatchSwitch` supports constructing a "switch" statement, where each
case of the switch is defined by an AST matcher. The cases are considered in
order, like pattern matching in functional languages.
Differential Revision: https://reviews.llvm.org/D120900
This patch adds a simpe lattice used to collect source loctions. An intended application is to track errors found in code during an analysis.
Differential Revision: https://reviews.llvm.org/D120890
This patch extends the existing if combining canonicalization to also handle the case where a value returned by the first if is used within the body of the second if.
This patch also extends if combining to support if's whose conditions are logical negations of each other.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D120924
When lowering LLVM-IR to instruction referencing stuff, if a value is
defined by a COPY, we try and follow the register definitions back to where
the value was defined, and build an instruction reference to that
instruction. In a few scenarios (such as arguments), this isn't possible.
I added some assertions to catch cases that weren't explicitly whitelisted.
Over the course of a few months, several more scenarios have cropped up,
the lastest is the llvm.read_register intrinsic, which lets LLVM-IR read an
arbitary register at any point. In the face of this, there's little point
in validating whether debug-info reads a register in an expected scenario.
Thus: this patch just deletes those assertions, and adds a regression test
to check that something is done with the llvm.read_register intrinsic.
Fixes#54190
Differential Revision: https://reviews.llvm.org/D121001
This completes the removal of uses of SelectionDAG::getSplatValue started in D119090 - by avoiding extracting the splatted element we make it a lot easier to zero-extend the bottom 64-bits of the shift amount and fixes issues we had on 32-bit targets where i64 isn't legal.
I've removed the old version of getTargetVShiftNode that took the scalar shift amount argument and LowerRotate can finally efficiently handle vXi16 rotates-by-scalar (using the same code as general funnel-shifts).
The only regression we see is in the X86-AVX2 PR52719 test case in vector-shift-ashr-256.ll - this is now hitting the same problem as the X86-AVX1 case (failure to simplify a multi-use X86ISD::VBROADCAST_LOAD) which I intend to address in a follow up patch.
It caused builds to assert with:
(StackSize == 0 && "We already have the CFA offset!"),
function generateCompactUnwindEncoding, file AArch64AsmBackend.cpp, line 624.
when targeting iOS. See comment on the code review for reproducer.
> This patch rearranges emission of CFI instructions, so the resulting
> DWARF and `.eh_frame` information is precise at every instruction.
>
> The current state is that the unwind info is emitted only after the
> function prologue. This is fine for synchronous (e.g. C++) exceptions,
> but the information is generally incorrect when the program counter is
> at an instruction in the prologue or the epilogue, for example:
>
> ```
> stp x29, x30, [sp, #-16]! // 16-byte Folded Spill
> mov x29, sp
> .cfi_def_cfa w29, 16
> ...
> ```
>
> after the `stp` is executed the (initial) rule for the CFA still says
> the CFA is in the `sp`, even though it's already offset by 16 bytes
>
> A correct unwind info could look like:
> ```
> stp x29, x30, [sp, #-16]! // 16-byte Folded Spill
> .cfi_def_cfa_offset 16
> mov x29, sp
> .cfi_def_cfa w29, 16
> ...
> ```
>
> Having this information precise up to an instruction is useful for
> sampling profilers that would like to get a stack backtrace. The end
> goal (towards this patch is just a step) is to have fully working
> `-fasynchronous-unwind-tables`.
>
> Reviewed By: danielkiss, MaskRay
>
> Differential Revision: https://reviews.llvm.org/D111411
This reverts commit 32e8b550e5.
Remove the executable name from the test match as this will have
a `.exe` suffix on windows.
Reviewed By: drodriguez
Differential Revision: https://reviews.llvm.org/D121000
This patch makes coalesce skip the comparison of all pairs of IntegerPolyhedrons with LocalIds rather than crash. The heuristics to handle these cases will be upstreamed later on.
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D120995
Add the -o flag to specify an output path for llvm-bitcode-strip.
This matches the interface to the Xcode bitcode_strip tool.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D120731
We can simplify an extractvalue of an insertvalue to extract out of the base of the insertvalue, if the insert and extract are at distinct and non-prefix'd indices
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D120915
Prior to this change LLVM would happily elide a call to any allocation
function and a call to any free function operating on the same unused
pointer. This can cause problems in some obscure cases, for example if
the body of operator::new can be inlined but the body of
operator::delete can't, as in this example from jyknight:
#include <stdlib.h>
#include <stdio.h>
int allocs = 0;
void *operator new(size_t n) {
allocs++;
void *mem = malloc(n);
if (!mem) abort();
return mem;
}
__attribute__((noinline)) void operator delete(void *mem) noexcept {
allocs--;
free(mem);
}
void deleteit(int*i) { delete i; }
int main() {
int*i = new int;
deleteit(i);
if (allocs != 0)
printf("MEMORY LEAK! allocs: %d\n", allocs);
}
This patch addresses the issue by introducing the concept of an
allocator function family and uses it to make sure that alloc/free
function pairs are only removed if they're in the same family.
Differential Revision: https://reviews.llvm.org/D117356
In dd875dd88b I added a missing MLIR
dependency in Flang. However, that particular CMake target is not
exported as something available to standalone builds. In this patch is
switch to `MLIRIR` instead, which depends on
`MLIRBuiltinAttributeInterfacesIncGen` - the missing dependency added
previously.
Differential Revision: https://reviews.llvm.org/D120986
In addressing the buffer ownership API, I discovered a rogue member
function that returned by value rather than by reference. It clearly
intended to return by reference, but because the copy ctor wasn't
deleted this wasn't caught.
It is not necessary to make this a move-only type, although that would
be an alternative.
Reviewed By: bruno
Differential Revision: https://reviews.llvm.org/D120901
This commit divides the large test files(over 30k lines) under clang/test/CodeGen/RISCV including:
rvv-intrinsics/vloxseg.c
rvv-intrinsics/vluxseg.c
rvv-intrinsics-overloaded/vloxseg.c
rvv-intrinsics-overloaded/vluxseg.c
into "non-masked" version and "masked" version which can reduce the test cases by 50% in a single file.
Differential Revision: https://reviews.llvm.org/D120967
Don't override SubtargetPredicate since that is already set in the
base classes for the appropriate subtarget like MIMG_gfx10. Use
OtherPredicates instead for consistency with the way we handle
features like HasImageInsts and HasExtendedImageInsts. NFC.
Differential Revision: https://reviews.llvm.org/D120909
When triggered during operation legalisation the affected combine
generates a splat_vector that when custom lowered for SVE fixed
length code generation, results in the original precombine sequence
and thus we enter a legalisation/combine hang.
NOTE: The patch contains no tests because I observed this issue
only when combined with other work that might never become public.
The current way AArch64 lowers ISD::SPLAT_VECTOR meant a specific
test was not possible so I'm hoping the DAGCombiner fix can be seen
as obvious. The AArch64ISelLowering change is requirted to maintain
existing code quality.
Differential Revision: https://reviews.llvm.org/D120735
This test file has grown to the point where it takes a huge amount of
time to run. At the moment, this test seems to consistently time out
when running in the pre-commit checks in Phabricator with a 10 minute
timeout. For example see
https://reviews.llvm.org/harbormaster/unit/view/2832724/
While splitting up the test file is not ideal, it is even more
undesirable to have huge test files that time out in common settings.
This patch splits up the test file roughly in the middle.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D120876
This test file has grown to the point where it takes a huge amount of
time to run. At the moment, this test seems to consistently time out
when running in the pre-commit checks in Phabricator with a 10 minute
timeout. For example see
https://reviews.llvm.org/harbormaster/unit/view/2832723/
While splitting up the test file is not ideal, it is even more
undesirable to have huge test files that time out in common settings.
This patch splits up the test file roughly in the middle.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D120875
This turns upz1(x, undef) to concat(truncate(x), undef), as the truncate
is simpler and can often be optimized away, and it helps some of the
insert-subvector tests optimize more cleanly.
Differential Revision: https://reviews.llvm.org/D120879
This check is not relevant for correctness, it can only avoid
walking some recursive uses if the cast is to a non-function
pointer type. As this distinction will no longer be possible
with opaque pointers and all users will have to be walked
anyway, I'm dropping the check in advance.
Previous and OhterPrev may not be in the same block. Use DT::dominates
instead of local comesBefore. DT::dominates is already used earlier to
check the order of Previous and SinkCandidate.
Fixes https://github.com/llvm/llvm-project/issues/54195
Recommit without changes over 53abe3ff66,
which addressed the cause of the reported crash.
-----
With opaque pointers, the zero-offset load will generally not use
a GEP. Allow a direct load without GEP, which is treated the same
way as a zero-offset GEP.
A bug was introduced in 5ea35791e6 that
gave the wrong name and description for TuneExynosM4. This patch fixes
that and changes it back to m3.
Differential Revision: https://reviews.llvm.org/D120665
Use the overload that support moving into an empty block. I don't
think that this situation can occur right now, but it can happen
with the change from e7fb1c15cb,
and the test is derived from the issue reported there.
When the stack has SVE objects, fixed-width objects are often better accessed
from the SP, instead of the FP, because part/all of the fixed-width offset
can be folded into the (non-scalable) addressing mode, where otherwise an
ADDVL would be required.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D120738
In this case, the access would benefit from being accessed from the SP, as that
would avoid the redundant ADDVL, since most of the offset can currently be
folded into the addressing mode.
This completes the propagation of type IDs through bitcode reading,
and switches remaining uses of getPointerElementType() to use
contained type IDs.
The main new thing here is that sometimes we need to create a type
ID for a type that was not explicitly encoded in bitcode (or we
don't know its ID at the current point). For such types we create a
"virtual" type ID, which is cached based on the type and the
contained type IDs. Luckily, we generally only need zero or one
contained type IDs, and in the one case where we need two, we can
get away with not including it in the cache key.
With this change, we pass the entirety of llvm-test-suite at O3
with opaque pointers.
Differential Revision: https://reviews.llvm.org/D120471