We already do this for splat nodes that carry a VL, but not for
splats that use VLMAX.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D115483
Introduce a FreeBSDKernel plugin that provides the ability to read
FreeBSD kernel core dumps. The plugin utilizes libfbsdvmcore to provide
support for both "full memory dump" and minidump formats across variety
of architectures supported by FreeBSD. It provides the ability to read
kernel memory, as well as the crashed thread status with registers
on arm64, i386 and x86_64.
Differential Revision: https://reviews.llvm.org/D114911
LLVM forks may use GitHub Actions as well as the upstream projects,
but they do not necessarily follow the same development processes.
Disable automatic issue labeling for forks, so that it does not
interfere with downstream repo automation.
Reviewed By: tstellar
Differential Revision: https://reviews.llvm.org/D115708
If a loop isn't forced to be unrolled, we want to avoid unrolling it when there
is an explicit unroll-and-jam pragma. This is to prevent automatic unrolling
from interfering with the user requested transformation.
Differential Revision: https://reviews.llvm.org/D114886
These flags are documented as generating poison values for particular input values. As such, we should really be consistent about their handling with how we handle nsw/nuw/exact/inbounds.
Differential Revision: https://reviews.llvm.org/D115460
When possible, optimize TRUNCATE to generate Wasm SIMD narrow
instructions (i16x8.narrow_i32x4_u, i8x16.narrow_i16x8_u), rather than generate
lots of extract_lane and replace_lane.
Closes#50350.
The library is always build using C++20 so these guards are not needed.
Reviewed By: #libc, Quuxplusone, ldionne
Differential Revision: https://reviews.llvm.org/D115644
data point using the 3-dim tensor nell-2.tns
MLIR:
READ FILE INTO COO: 24424.369294 ms ---> improves to ----> 9638.501044 ms
SORT COO BEFORE PACK: 762.834831 ms
PACK COO TO TENSOR: 1243.376245 ms
TACO:
b file read: 13270.9 ms
b pack: 7137.74 ms
b size: (12092 x 9184 x 28818), 925300328 bytes
https://github.com/llvm/llvm-project/issues/52679
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D115696
As noted in the code comment, we might want to simply give up on this select
transform completely (given how many exceptions there are already and the
risk of future conflicts), but for now, carve out one more bailout to
avoid an infinite loop.
Fixes#52684:
https://github.com/llvm/llvm-project/issues/52684
The transform can require an optional shuffle instruction to be sound,
so we need to use Builder to create all values and then replace the
original instruction with whatever that final value is.
Bazel support is through utils/bazel, and the BUILD files in `benchmark`
just complicate downstream integrates for bazel based builds.
Differential Revision: https://reviews.llvm.org/D115733
As explained in https://stackoverflow.com/a/70339311/627587, the fact
that shrink_to_fit wasn't defined as inline lead to issues when explicitly
instantiating basic_string. While explicit instantiations are always
somewhat brittle, this one was clearly a bug on our end.
Differential Revision: https://reviews.llvm.org/D115656
This change moves optimized callbacks from each .o file to compiler-rt. Instead of using code generation it uses direct assembly implementation. Please note that the 'or' version is not implemented and it will produce unresolved external if somehow 'or' version is requested.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D114558
If either operand has an element with allbits set, then we don't need the equivalent element from the other operand, as allbits are guaranteed to be set.
This reverts commit ac60263ad1.
It looks like the test fails on certain non-Darwin system, even though
the triple is explicitly set to macos. Revert while I investigate.
The LangRef incorrectly says that if no exact match is found when
seeking alignment for a vector type, the largest vector type smaller
than the sought-after vector type. This is incorrect as vector types
require an exact match, else they fall back to reporting the natural
alignment.
The corrected rule was not added in its place, as rules for other types
(e.g., floating-point types) aren't documented.
A unit test was added to demonstrate this.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D112463
memset_pattern{4,8,16} writes to the first argument. Use getForDest
to return the corresponding MemoryLocation.
Reviewed By: ab
Differential Revision: https://reviews.llvm.org/D114906
MIPS64 little endian target has a "special" encoding of `r_info`
relocation record field. Instead of one 64-bit little endian number, it
is a little endian 32-bit number followed by a 32-bit big endian number.
For correct reading and writing such fields we must provide information
about target machine into the corresponding routine. This patch does
this for the `llvm-objcopy` tool and fix handling of MIPS64 little
endian files.
The bug was reported in the issue #52647.
Differential Revision: https://reviews.llvm.org/D115635
Changes the preliminary multinode analysis:
1. Introduced scores for reversed loads/extractelements.
2. Improved shallow score calculation.
3. Lowered the cost of external uses (no need to consider it several times, just ones).
4. The initial lane for analysis is the one with the minimal possible
reorderings.
These changes in general shall reduce compile time and improve the
reordering in many cases.
Part of D57059.
Differential Revision: https://reviews.llvm.org/D101109
Summary:
As pointed out in https://github.com/llvm/llvm-project/issues/52610, the
changes to GraphWriter made in https://reviews.llvm.org/D87202 introduced
some bad code and the changes are actually unnecessary for
-print-changed=dot-cfg. The code in question is guarded by a call to
DefaultDOTGraphTraits::hasEdgeDestLabels() which is always false when the
graph writer is called for -print-changed=dot-cfg. Revert this section of
the changes.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By:jrtc27(Jessica Clarke), xgupta(Shivam Gupta)
Differential Revision: https://reviews.llvm.org/D115649
The vp.load and vp.gather intrinsics require the intrinsic return
type to determine the correct function signature. With opaque pointers,
it cannot be derived from the parameter pointee types.
Differential Revision: https://reviews.llvm.org/D115632
Make the reduction handling in OpenMPIRBuilder compatible with
opaque pointers by explicitly storing the element type in ReductionInfo,
and also passing it to the atomic reduction callback, as at least
the ones in the test need the type there.
This doesn't make things fully compatible yet, there are other
uses of element types in this class. I also left one
getPointerElementType() call in mlir, because I'm not familiar
with that area.
Differential Revison: https://reviews.llvm.org/D115638
As reported from a failing firefox build, we can sometimes get frame
indices with negative offsets from a t2LDRi8. This adds support for
them, to prevent the crash.
Instead of printing analysis debug information to stderr, annotate the IR. This makes it easier to understand decisions made by the analysis, especially in larger input IR.
Differential Revision: https://reviews.llvm.org/D115575
Converts concat_vectors(V64 (trunc V128), V64 (trunc V128)), which
would otherwise be lowered as xtn followed by xtn2, to uzp1.
Differential Revision: https://reviews.llvm.org/D115435
Fix a couple of things that were causing stack protection to not work
correctly in functions that have scalable vectors on the stack:
* Use TypeSize when determining if accesses to a variable are
considered out-of-bounds so that the behaviour is correct for
scalable vectors.
* When stack protection is enabled move the stack protector location
to the top of the SVE locals, so that any overflow in them (or the
other locals which are below that) will be detected.
Fixes: https://github.com/llvm/llvm-project/issues/51137
Differential Revision: https://reviews.llvm.org/D111631
This no longer allows creating an invalid Address through the regular
constructor. There were only two places that did this (AggValueSlot
and EHCleanupScope) which did this by converting a potential nullptr
into an Address. I've fixed both of these by directly storing an
Address instead.
This is intended as a bit of preliminary cleanup for D103465.
Differential Revision: https://reviews.llvm.org/D115630
This patch updates expandCALL_RVMARKER to wrap the call, marker and
objc runtime call in an instruction bundle. This ensures later passes,
like machine block placement, cannot break them up.
On AArch64, the instruction sequence is already wrapped in a bundle.
Keeping the whole instruction sequence together is highly desirable for
performance and outweighs potential other benefits from breaking the
sequence up.
Reviewed By: ahatanak
Differential Revision: https://reviews.llvm.org/D115230
Dependency scanner test for resource directory deduction doesn't account for LLVM builds with custom `CLANG_RESOURCE_DIR`.
This patch ensures we don't hardcode the default behavior into the test and take into account the actual value. This is done by running `%clang -print-resource-dir` and using that as the expected value in test assertions.
New comment also clarifies this is different from running that command as part of the dependency scan.
Reviewed By: mgorny
Differential Revision: https://reviews.llvm.org/D115628