The restriction goes back to:
16f18ed7b5
...but the fold only replaces a shift with a shift, so that's not necessary.
Generalizing to other opcodes is planned as a follow-up.
There are a few places where we use report_fatal_error when the input is broken.
Currently, this function always crashes LLVM with an abort signal, which
then triggers the backtrace printing code.
I think this is excessive, as wrong input shouldn't give a link to
LLVM's GitHub issue URL and tell users to file a bug report.
We shouldn't print a stack trace either.
This patch changes report_fatal_error so it uses exit() rather than
abort() when its argument GenCrashDiag=false.
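A minimal sketch of the exit()-versus-abort() choice described above. The names
Termination, chooseTermination and reportFatalSketch are hypothetical and exist
only for illustration; this is not the code in LLVM's ErrorHandling.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical names for illustration; not LLVM's actual implementation.
enum class Termination { Exit, Abort };

// Map the GenCrashDiag flag to a termination path.
Termination chooseTermination(bool genCrashDiag) {
  return genCrashDiag ? Termination::Abort : Termination::Exit;
}

// Print the reason, then terminate along the chosen path.
[[noreturn]] void reportFatalSketch(const char *reason, bool genCrashDiag) {
  std::fprintf(stderr, "fatal error: %s\n", reason);
  if (chooseTermination(genCrashDiag) == Termination::Abort)
    std::abort(); // raises SIGABRT, which crash handlers can intercept
  std::exit(1);   // ordinary non-zero exit, no abort signal
}
```

With genCrashDiag=false the process ends through exit(1), so no abort signal is
raised and nothing triggers the backtrace printing path.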
Reviewed by: nikic, MaskRay, RKSimon
Differential Revision: https://reviews.llvm.org/D126550
No test changes because `err_module_odr_violation_mismatch_decl_unknown`
is a catch-all used when a custom diagnostic is missing. A missing custom
diagnostic should be fixed by implementing it, not by improving the
general case. But if we pass an enum value not covered by 'select', clang
can crash, so protect against that.
Differential Revision: https://reviews.llvm.org/D126566
STATEPOINT is a special pseudo instruction which represents moving GC semantics to LLVM.
Every tied def/use VReg pair in STATEPOINT represents the same physical register, which can
'magically' change during the call wrapped by the statepoint.
(By construction, the tied use operand is not live across STATEPOINT.)
This means that when converting into two-address form, there is no need to insert a COPY
instruction before the statepoint, as the TwoAddressInstruction pass does for 'regular'
instructions.
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D124631
Vector types in HLSL use clang's ext_vector_type.
The vector types are declared in the builtin header hlsl.h.
hlsl.h will be included by default for HLSL shaders.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D125052
Sanitizers ignore the flag allocator_may_return_null=1 in strndup() calls.
When OOM is emulated, this causes an unexpected crash.
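As a rough illustration of the intended behaviour, here is a strndup-like
helper that returns null when its allocator fails. The helper and the
injectable allocator (strndupSketch, AllocFn, failingAlloc, mallocAlloc) are
hypothetical and are not the sanitizer runtime's code. Treating the crash as a
missing null check is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical allocator hook so that an OOM condition can be emulated.
using AllocFn = void *(*)(std::size_t);

// Copy at most n characters of s; return nullptr if allocation fails.
char *strndupSketch(const char *s, std::size_t n, AllocFn alloc) {
  std::size_t len = 0;
  while (len < n && s[len] != '\0') // bounded length, like strnlen
    ++len;
  char *copy = static_cast<char *>(alloc(len + 1));
  if (copy == nullptr)
    return nullptr; // propagate the failure instead of writing through null
  std::memcpy(copy, s, len);
  copy[len] = '\0';
  return copy;
}

// Emulates OOM: every allocation fails.
void *failingAlloc(std::size_t) { return nullptr; }

// Ordinary allocation via malloc.
void *mallocAlloc(std::size_t n) { return std::malloc(n); }
```

Calling strndupSketch("hello", 3, failingAlloc) yields a null pointer rather
than a crash, which is the behaviour allocator_may_return_null=1 asks for.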
Committed by pgousseau on behalf of "Kostyantyn Melnik, kmnls.kmnls@gmail.com"
Reviewed by: pgousseau
Differential Revision: https://reviews.llvm.org/D126452
Python bindings for extensions of the Transform dialect are defined in separate
Python source files that can be imported on-demand, i.e., that are not imported
with the "main" transform dialect. This requires a minor addition to the
ODS-based bindings generator. This approach is consistent with the current
model for downstream projects that are expected to bundle MLIR Python bindings:
such projects can include their custom extensions into the bundle similarly to
how they include their dialects.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D126208
Vectorization is a key transformation to achieve high performance on most
architectures. In the transform dialect, vectorization is implemented as a
parameterizable transform op. It currently applies to a scope of payload IR
delimited by some isolated-from-above op, mainly because several enabling
transformations (such as affine simplification) are needed to perform
vectorization and these transformations would apply to ops other than the "main"
computational payload op. A separate "navigation" transform op that obtains the
isolated-from-above ancestor of an op is introduced in the core transform
dialect. Even though it is currently only useful for vectorization,
isolated-from-above ops are a common anchor for transformations (usually
implemented as passes) that is likely to be reused in the future.
Depends On D126374
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D126542
If only one of the GEPs is inbounds, then after swapping, there is
no guarantee that one of them will be inbounds as well
(see e.g. https://alive2.llvm.org/ce/z/agaCnp).
This is only a partial fix, because even if both are inbounds, the
result is not necessarily inbounds (if the offsets have different
signs).
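The different-signs case can be illustrated with a toy model in which an
address is just a byte offset into one object. This is a simplified sketch of
the inbounds rule, not the full LangRef semantics, and the names stepInBounds
and chainInBounds are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Toy model: an address is an offset from the start of an object of `size`
// bytes. An "inbounds" step must start and land in [0, size]; the
// one-past-the-end offset is allowed.
bool stepInBounds(std::int64_t from, std::int64_t delta, std::int64_t size) {
  std::int64_t to = from + delta;
  return from >= 0 && from <= size && to >= 0 && to <= size;
}

// True if applying `first` and then `second` from offset 0 keeps both steps
// in bounds.
bool chainInBounds(std::int64_t first, std::int64_t second, std::int64_t size) {
  return stepInBounds(0, first, size) && stepInBounds(first, second, size);
}
```

For a 10-byte object, offsets +8 then -5 stay in bounds at every step, while
the swapped order -5 then +8 leaves the object on the first step, even though
both orders end at offset 3.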
Without this patch, arguments to the
`llvm::OpenMPIRBuilder::AtomicOpValue` initializer are reversed.
Reviewed By: ABataev, tianshilei1992
Differential Revision: https://reviews.llvm.org/D126619
This patch implements the following semantic check:
A list-item cannot appear in more than one nontemporal clause.
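A minimal sketch of such a check, assuming each clause is modelled as a plain
list of item names. The Clause alias and itemsInMultipleClauses are
hypothetical and do not reflect the frontend's actual data structures.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical representation: one clause is a list of item names.
using Clause = std::vector<std::string>;

// Return the names that appear in more than one clause.
std::set<std::string>
itemsInMultipleClauses(const std::vector<Clause> &clauses) {
  std::set<std::string> seen, repeated;
  for (const Clause &clause : clauses) {
    // Deduplicate within the clause so only cross-clause repeats are flagged.
    std::set<std::string> inThisClause(clause.begin(), clause.end());
    for (const std::string &item : inThisClause)
      if (!seen.insert(item).second)
        repeated.insert(item);
  }
  return repeated;
}
```

Each name in the returned set would correspond to one diagnostic.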
Reviewed By: kiranchandramohan, shraiysh
Differential Revision: https://reviews.llvm.org/D110270
As the long explanatory comment attests, performing the modification
in place is pretty tricky. Drop this unnecessary complexity and
always create new instructions.
This should be NFC-ish, but may cause differences due to
worklist order.
The zip files were too large to be practical, so they were never
shipped. Reverting to reduce build time and complexity of the script.
This reverts commit 4486aa03c5.
Test that the chunk size is passed to the static init function,
using three different variations:
1. Single constant.
2. Expression with constants.
3. Variable value.
Reviewed By: peixin, shraiysh
Differential Revision: https://reviews.llvm.org/D126383
VP intrinsics exhibit UB if the %evl parameter is out of bounds, so they must
not carry the speculatable attribute. The out-of-bounds UB disappears
when the %evl parameter is expanded into the mask or expansion replaces
the entire VP intrinsic with non-VP code.
This patch:
- Removes the speculatable attribute on all VP intrinsics.
- Generalizes the isSafeToSpeculativelyExecute function to let VP
  expansion know whether the VP intrinsic replacement will be
  speculatable. VP expansion may only discard %evl where this is the
  case.
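To illustrate why the out-of-bounds UB disappears once %evl is expanded into
the mask, here is a simplified model of that folding on plain boolean vectors.
foldEvlIntoMask is a hypothetical helper, not the expansion pass itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Lane i stays active only if it was active in the mask and i < evl.
std::vector<bool> foldEvlIntoMask(const std::vector<bool> &mask,
                                  std::size_t evl) {
  std::vector<bool> folded(mask.size());
  for (std::size_t i = 0; i < mask.size(); ++i)
    folded[i] = mask[i] && i < evl;
  return folded;
}
```

In this model an evl larger than the vector length simply leaves the mask
unchanged, so the folded operation no longer depends on evl being in range.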
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D125296
Check in updates based on how the latest release was built [0] and add
the bug fix from [1] which allows LLDB to start.
Other changes which had accumulated in the local release script:
- Don't build the clang format plugin (VS has the functionality built
  in now)
- Disable tests that have been failing (I'll try to follow up and
  re-enable them)
- Switch to Python 3.10
- Jump through more hoops to make LLDB pick the right Python.
0. https://discourse.llvm.org/t/14-0-4-final-has-been-tagged/62750/3
1. https://github.com/llvm/llvm-project/issues/54589
It appears that float support is complete, or at least, the stackmap records
emitted are not inconceivable (I must admit that I don't know about many of the
architectures under test here).
One curiosity: the SystemZ tests highlight an undocumented (or maybe incorrect)
quirk of the stackmap format. In the case of a Register record, the Offset or
SmallConstant field can encode a sub-register index! I've only ever seen this
field be zero for Register entries up until now.
Since the SPIR/SPIR-V targets enable all known features, we must
ensure the Work-group Collective Functions feature macro is set for
OpenCL 3.0.
Fixes https://github.com/llvm/llvm-project/issues/55770
Add ops to the structured transform extension of the transform dialect that
perform interchange, padding and scalarization on structured ops. Along with
tiling that is already defined, this provides a minimal set of transformations
necessary to build vectorizable code for a single structured op.
Define two helper traits: one that implements TransformOpInterface by applying
a function to each payload op independently and another that provides a simple
"functional-style" producer/consumer list of memory effects for the transform
ops.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D126374
znver1 ymm variants of VPMOVSX**/VPMOVZX** instructions require double pumping.
Now matches AMD SoG, Agner and instlatx64 numbers.
Thanks to @fabian-r for the report
This option was added in D89854. It prevents GVN from performing
load PRE in a loop, if doing so would require critical edge
splitting on the backedge. From the review:
> I know that GVN Load PRE negatively impacts peeling,
> loop predication, so the passes expecting that latch has
> a conditional branch.
In the PhaseOrdering test in this patch, splitting the backedge
negatively affects vectorization: After critical edge splitting,
the loop gets rotated, effectively peeling off the first loop
iteration. The effect is that the first element is handled
separately, then the bulk of the elements use a vectorized
reduction (but using unaligned, off-by-one memory accesses) and
then a tail of 15 elements is handled separately again.
It's probably worth noting that the loop load PRE from D99926 is
not affected by this change (as it does not need backedge
splitting). This is about normal load PRE that happens to occur
inside a loop.
Differential Revision: https://reviews.llvm.org/D126382
I don't see a point in the lit tests here since sqrt, mul and other ops
expand as well. I just added "smoke" tests to verify that the conversion works
and does not create any illegal ops.
I will create a patch that adds a simple integration test to
mlir/test/Integration/Dialect/ComplexOps/ that will compare the values.
Differential Revision: https://reviews.llvm.org/D126539