Originally, `OptLevel` isn't passed into the `MachineFunctionPass`.
This lets the default parameter of `SelectionDAGISel`, which is
`CodeGenOpt::Default`, be passed in. OptLevelChanger captures the
optimization level with the parameter, and rather not the value
within `TargetMachine`. This lets the optimization be
unintentionally overwriten if other value than `CodeGenOpt::Default`
passed.
This patch fixes this by passing the optimization level rather
than using the default value.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D126641
The optimization level should not be restored into O2.
This is a pre-commit test case to show fix in D126641.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D126677
GCC recently started setting constructor priority on init_have_lse_atomics [1]
to avoid undefined initialization order with respect to other initializers,
causing accidental use of ll/sc intrinsics on targets where this was not
intended (which presents a minor performance problem as well as a
compatibility problem for users wanting to use the rr debugger). I initially
thought compiler-rt does not have the same issue as libgcc, since it looks
like we're already setting init priority on the constructor.
Unfortuantely, it does not appear that the HAVE_INIT_PRIORITY check is ever
performed anyway, so despite appearances the init priority was not actually
applied. Fix that by applying the init priority unconditionally. It has been
supported in clang ever since it was first introduced and in any case for
more than 14 years in both gcc and clang. MSVC is already excluded from this
code path and we're already using constructors with init priority elsewhere
in compiler-rt without additional check (though mostly in the sanitizer
runtime, which may have more narrow target support). Regardless, I believe
that for our supported compilers, if they support the constructor attribute,
they should also support init priorities.
While we're here, change the init priority from 101, which is the highest
priority for end user applications, to instead use one of the priority levels
reserved for implementations (1-100; lower integers are higher priority).
GCC ended up using `90`, so this commit aligns the value in compiler-rt
to the same value to ensure that there are no subtle initialization order
differences between libgcc and compiler-rt.
[1] 75c4e4909a
Differential Revision: https://reviews.llvm.org/D126424
Such vector types cannot be split at the moment, because splitting expects
an even number of elements in the source type. If a target marks such a type
as "split", TargetLoweringBase::computeRegisterProperties will override it
with widening to the next power of 2. This could lead to issues during
instruction selection when conflicting information about how to handle this
type is present.
D126308 broke building clang standalone, as LLVM_UTILS_INSTALL_DIR is
not exported.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D126671
(C2 >> X) >> C1 --> (C2 >> C1) >> X
The shift-left form of this transform has existed since:
16f18ed7b5
...but it applies to matching shift right opcodes too:
https://alive2.llvm.org/ce/z/c5eQms
The restriction goes back to:
16f18ed7b5
...but the fold only replaces a shift with a shift, so that's not necessary.
Generalizing to other opcodes is planned as a follow-up.
There are a few places where we use report_fatal_error when the input is broken.
Currently, this function always crashes LLVM with an abort signal, which
then triggers the backtrace printing code.
I think this is excessive, as wrong input shouldn't give a link to
LLVM's github issue URL and tell users to file a bug report.
We shouldn't print a stack trace either.
This patch changes report_fatal_error so it uses exit() rather than
abort() when its argument GenCrashDiag=false.
Reviewed by: nikic, MaskRay, RKSimon
Differential Revision: https://reviews.llvm.org/D126550
No test changes because `err_module_odr_violation_mismatch_decl_unknown`
is a catch-all when custom diagnostic is missing. And missing custom
diagnostic we should fix by implementing it, not by improving the
general case. But if we pass enum value not covered by 'select', clang
can crash, so protect against that.
Differential Revision: https://reviews.llvm.org/D126566
STATEPOINT is a special pseudo instruction which represent Moving GC semantic to LLVM.
Every tied def/use VReg pair in STATEPOINT represent same physical register which can
'magically' change during call wrapped by statepoint.
(By construction, tied use operand is not live across STATEPOINT).
This means that when converting into two-address form, there is not need to insert COPY
instruction before stateppoint, what TwoAddressInstruction pass does for 'regular'
instructions.
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D124631
Vector types in hlsl is using clang ext_vector_type.
Declaration of vector types is in builtin header hlsl.h.
hlsl.h will be included by default for hlsl shader.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D125052
Sanitizers ignore flag allocator_may_return_null=1 in strndup() calls.
When OOM is emulated, this causes to the unexpected crash.
Committed by pgousseau on behalf of "Kostyantyn Melnik, kmnls.kmnls@gmail.com"
Reviewed by: pgousseau
Differential Revision: https://reviews.llvm.org/D126452
Python bindings for extensions of the Transform dialect are defined in separate
Python source files that can be imported on-demand, i.e., that are not imported
with the "main" transform dialect. This requires a minor addition to the
ODS-based bindings generator. This approach is consistent with the current
model for downstream projects that are expected to bundle MLIR Python bindings:
such projects can include their custom extensions into the bundle similarly to
how they include their dialects.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D126208
Vectorization is a key transformation to achieve high performance on most
architectures. In the transform dialect, vectorization is implemented as a
parameterizable transform op. It currently applies to a scope of payload IR
delimited by some isolated-from-above op, mainly because several enabling
transformations (such as affine simplification) are needed to perform
vectorization and these transformation would apply to ops other than the "main"
computational payload op. A separate "navigation" transform op that obtains the
isolated-from-above ancestor of an op is introduced in the core transform
dialect. Even though it is currently only useful for vectorization,
isolated-from-above ops are a common anchor for transformations (usually
implemented as passes) that is likely to be reused in the future.
Depends On D126374
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D126542
If only one of the GEPs is inbounds, then after swapping, there is
no guarantee that one of them will be inbounds as well
(see e.g. https://alive2.llvm.org/ce/z/agaCnp).
This is only a partial fix, because even if both are inbounds, the
result is not necessarily inbounds (if the offsets have different
signs).
Without this patch, arguments to the
`llvm::OpenMPIRBuilder::AtomicOpValue` initializer are reversed.
Reviewed By: ABataev, tianshilei1992
Differential Revision: https://reviews.llvm.org/D126619
This patch implements the following semantic check:
A list-item cannot appear in more than one nontemporal clause.
Reviewed By: kiranchandramohan, shraiysh
Differential Revision: https://reviews.llvm.org/D110270
As the long explanatory comment attests, performing the modification
in place is pretty tricky. Drop this unnecessary complexity and
always create new instructions.
This should be NFC-ish, but can probably cause difference due to
worklist order.
The zip files were too large to be practical, so they were never
shipped. Reverting to reduce build time and complexity of the script.
This reverts commit 4486aa03c5.
Test that chunk size is passed to the static init function.
Using three different variations:
1. Single constant.
2. Expression with constants.
3. Variable value.
Reviewed By: peixin, shraiysh
Differential Revision: https://reviews.llvm.org/D126383
VP intrinsics show UB if the %evl parameter is out of bounds - they must
not carry the speculatable attribute. The out-of-bounds UB disappears
when the %evl parameter is expanded into the mask or expansion replaces
the entire VP intrinsic with non-VP code.
This patch
- Removes the speculatable attribute on all VP intrinsics.
- Generalizes the isSafeToSpeculativelyExecute function to let VP
expansion know whether the VP intrinsic replacement will be
speculatable. VP expansion may only discard %evl where this is the
case.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D125296
Check in updates based on how the latest release was built [0] and add
the bug fix from [1] which allows LLDB to start.
Other changes which had accumulated in the local release script:
- Don't build the clang format plugin (VS has the functionality built
in now)
- Disable tests that have been failing (I'll try to follow up and
re-enable them)
- Switch to Python 3.10
- Jump through more hoops to make LLDB pick the right Python.
0. https://discourse.llvm.org/t/14-0-4-final-has-been-tagged/62750/3
1. https://github.com/llvm/llvm-project/issues/54589
It appears that float support is complete, or at least, the stackmap records
emitted are not inconceivable (I must admit that I don't know about many of the
architectures under test here).
One curiosity, the SystemZ tests highlight an undocumented (or maybe incorrect)
quirk of the stackmap format: in the case of a Register record, the Offset or
SmallConstant field can encode a sub-register index! I've only ever seen this
field zero for Register entries up until now.
Since the SPIR/SPIR-V targets enable all known features, we must
ensure the Work-group Collective Functions feature macro is set for
OpenCL 3.0.
Fixes https://github.com/llvm/llvm-project/issues/55770