Changes VPReplicateRecipe to extract the last lane from an unconditional,
uniform store instruction. collectLoopUniforms will also add stores to
the list of uniform instructions where Legal->isUniformMemOp is true.
setCostBasedWideningDecision now sets the widening decision for
all uniform memory ops to Scalarize, where previously GatherScatter
may have been chosen for scalable stores.
This fixes an assert ("Cannot yet scalarize uniform stores") in
setCostBasedWideningDecision when we have a loop containing a
uniform i1 store and a scalable VF, which we cannot create a scatter for.
Reviewed By: sdesmalen, david-arm, fhahn
Differential Revision: https://reviews.llvm.org/D112725
Add conversion pattern for the `fir.convert` operation.
This patch is part of the upstreaming effort from fir-dev branch.
This patch was previously landed with a truncated version that
was failing the windows buildbot.
Reviewed By: rovka, awarzynski
Differential Revision: https://reviews.llvm.org/D113469
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
This patch fixes an amusing bug where a Platform::Kill operation would
happily terminate a proces on a completely different platform, as long
as they have the same process ID. This was due to the fact that the
implementation was iterating through all known (debugged) processes in
order terminate them directly.
This patch just deletes that logic, and makes everything go through the
OS process termination APIs. While it would be possible to fix the logic
to check for a platform match, it seemed to me that the implementation
was being too smart for its own good -- accessing random Process
objects without knowing anything about their state is risky at best.
Going through the os ensures we avoid any races.
I also "upgrade" the termination signal to a SIGKILL to ensure the
process really dies after this operation.
Differential Revision: https://reviews.llvm.org/D113184
The alpha.security.cert section came right after alpha.security, making it look
like checkers like alpha.security.MmapWriteExec belonged to that package.
Differential Revision: https://reviews.llvm.org/D113397
Add conversion pattern for the `fir.convert` operation.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: rovka, awarzynski
Differential Revision: https://reviews.llvm.org/D113469
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
(Cond & C) | (~bitcast(Cond) & D) --> bitcast (select Cond, (bc C), (bc D))
This is part of fixing:
https://llvm.org/PR34047
That report shows a case where a bitcast is sitting between the select condition
candidate and its 'not' value due to current cast canonicalization rules.
There's a bitcast type restriction that might be violated in existing matching,
but I still need to investigate if that is possible -
Alive2 shows we can only do this transform safely when the bitcast is from
narrow to wide vector elements (otherwise poison could leak into elements
that were safe in the original code):
https://alive2.llvm.org/ce/z/Hf66qh
Differential Revision: https://reviews.llvm.org/D113035
Added a lit test that checks scev-based salvagaing does not select IVs
that have a SCEV containing an undef.
Original Differential Revision: https://reviews.llvm.org/D111810
This patch add conversion for primitive operations on complex types.
- fir.addc
- fir.subc
- fir.mulc
- fir.divc
- fir.negc
This adds also the type conversion for !fir.complex<KIND> type.
This patch is part of the upstreaming effort from fir-dev branch.
This patch was updated to avoid failure on windows buildbot.
Flang codegen does not support windows target so we force the test
to use a known target instead.
Reviewed By: kiranchandramohan, rovka
Differential Revision: https://reviews.llvm.org/D113434
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
attempts
Prevent the selection of IVs that have a SCEV containing an undef. Also
prevent salvaging attempts for values for which a SCEV could not be
created by ScalarEvolution and have only SCEVUknown.
Reviewed by: Orlando
Differential Revision: https://reviews.llvm.org/D111810
Without this patch, passingValueIsAlwaysUndefined will iterate over all
instructions from I to the end of the basic block, even if the use is
outside the block.
This patch adds an early bail out, if the use instruction is outside I's
BB. This can greatly reduce compile-time in cases where very large basic
blocks are involved, with a large number of PHI nodes and incoming
values.
Note that the refactoring makes the handling of the case where I is a
phi and Use is in PHI more explicit as well: for phi nodes, we can also
directly bail out. In the existing code, we would iterate until we reach
the end and return false.
Based on an earlier patch by Matt Wala.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D113293
This patch adds DUP+FMUL => FMUL_indexed pattern to InstCombiner.
FMUL_indexed is normally selected during instruction selection, but it
does not work in cases when VDUP and VMUL are in different basic
blocks.
Differential Revision: https://reviews.llvm.org/D99662
Calling clang-tidy on ClangTidyDiagnosticConsumer.cpp gives a
"unmatched NOLINTBEGIN without a subsequent NOLINTEND" warning.
The "NOLINTBEGIN" and "NOLINTEND" string literals used in the
implementation of `createNolintError()` get mistaken for actual
NOLINTBEGIN/END comments used to suppress clang-tidy warnings.
Rewrite the string literals so that they can no longer be mistaken for
actual suppression comments.
Differential Revision: https://reviews.llvm.org/D113472
On Armv6-M the branch may not able to reach the _Unwind_Resume function because it's relocation(R_ARM_THM_JUMP11) is in -2048, 2047 range only.
Reviewed By: chill, stuij, lenary
Differential Revision: https://reviews.llvm.org/D113181
This patch merges FoldConstantVectorArithmetic back into FoldConstantArithmetic.
Like FoldConstantVectorArithmetic we now handle vector ops with any operand count, but we currently still only handle binops for scalar types - this can be improved in future patches - in particular some common unary/trinary ops still have poor constant folding.
There's one change in functionality causing test changes - FoldConstantVectorArithmetic bails early if the build/splat vector isn't all constant (with some undefs) elements, but FoldConstantArithmetic doesn't - it instead attempts to fold the scalar nodes and bails if they fail to regenerate a constant/undef result, allowing some additional identity/undef patterns to be handled.
Differential Revision: https://reviews.llvm.org/D113300
It is often useful to know which die is the parent of the current die.
This patch adds information about parent offset into the dump:
0x0000000b: DW_TAG_compile_unit
DW_AT_producer ("by_hand")
0x00000014: DW_TAG_base_type (0x0000000b) <<<<<<<<<<<<<<
DW_AT_name ("int")
Now it is easy to see which die is the parent of the current die.
This patch makes that behaviour to be default.
We can make it to be opt-in if neccessary.
This functionality differs from already existed "--show-parents"
in that sence that parent information is shown for all dies and
only link to the immediate parent is shown.
Differential Revision: https://reviews.llvm.org/D113406
It is trivial to produce DemandedSrcElts given DemandedReplicatedElts,
so don't pass the former. Also, it isn't really useful so far
to have the overload taking the Mask, so just inline it.
This reapplies patch db289340c8.
The test failures on build with expensive checks caused by the patch happened due
to the fact that we sorted loop Phis in replaceCongruentIVs using llvm::sort,
which shuffles the given container if the expensive checks are enabled,
so equivalent Phis in the sorted vector had different mutual order from run
to run. replaceCongruentIVs tries to replace narrow Phis with truncations
of wide ones. In some test cases there were several Phis with the same
width, so if their order differs from run to run, the narrow Phis would
be replaced with a different Phi, depending on the shuffling result.
The patch ae14fae0ff fixed this issue by
replacing llvm::sort with llvm::stable_sort.
This patch adds a function to verify general properties of VPlans. The
first check makes sure that all phi-like recipes are at the beginning of
a block, with no other recipes in between.
Note that this currently may not hold for VPBlendRecipes at the moment,
as other recipes may be inserted before the VPBlendRecipe during mask
creation.
Note that this patch depends on D111300 and D111301, which fix code that
breaks the checked invariant.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D111302
This patch add conversion for primitive operations on complex types.
- fir.addc
- fir.subc
- fir.mulc
- fir.divc
- fir.negc
This adds also the type conversion for !fir.complex<KIND> type.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: rovka
Differential Revision: https://reviews.llvm.org/D113434
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Applying the same rules as for LLVM_BUILD_INSTRUMENTED build in the cmake files.
By having this patch, we are able to disable/enable instrument+coverage build
of the compiler-rt project when building instrumented LLVM.
Differential Revision: https://reviews.llvm.org/D108127
Summary:
This patch adds yaml2obj supporting for the auxiliary
file header of XCOFF.
Reviewed By: DiggerLin, jhenderson
Differential Revision: https://reviews.llvm.org/D111487
This is a fix for test failures on expensive checks build caused by db289340c8.
With LLVM_ENABLE_EXPENSIVE_CHECKS enabled the llvm::sort shuffles the given container.
However, the sort is only called when the TTI is passed to replaceCongruentIVs.
In the mentioned patch we pass it TTI, so the sort happens. But due to shuffling
equivalent Phis may appear in different order from run to run.
With the stable_sort instead of sort this is impossible - the order of sorted Phis
is preserved.
Rewrite function signatures and calls to functions that accept or return
COMPLEX values.
Also teach insert_value and extract_value about the MLIR ComplexType, by
adding AnyComplex to AnyCompositeLike.
This patch is part of the effort for upstreaming the fir-dev branch.
Differential Revision: https://reviews.llvm.org/D113273
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
Co-authored-by: Tim Keith <tkeith@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
This fixes an assertion failure with -early-live-intervals when trying
to update the live intervals for a debug instruction, which don't even
have slot indexes.
Differential Revision: https://reviews.llvm.org/D113116
This patch factors out division representation computation from upper-lower bound
inequalities to a separate function. This is done to improve readability and reuse.
This patch is marked NFC since the only change is factoring out existing code
to a separate function.
Reviewed By: grosser
Differential Revision: https://reviews.llvm.org/D113463
Clang builtin utility `__remove_address_space` now works if generic
address space is not supported in C++ for OpenCL 2021.
Differential Revision: https://reviews.llvm.org/D110155
This patch adds the basic infrastructure for the TargetRewrite pass,
which rewrites certain FIR dialect operations into target specific
forms. In particular, it converts boxchar function parameters, call
arguments and return values. Other convertions will be included in
future patches.
This patch is part of the effort for upstreaming the fir-dev branch.
Differential Revision: https://reviews.llvm.org/D112910
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
Co-authored-by: Tim Keith <tkeith@nvidia.com>
The existing CGOpenMPRuntimeAMDGCN and CGOpenMPRuntimeNVPTX classes are
just code bloat. By removing them, the codebase gets a bit cleaner.
Reviewed By: jdoerfert, JonChesterfield, tianshilei1992
Differential Revision: https://reviews.llvm.org/D113421
operand bundle "clang.arc.attachedcall" with ObjC runtime functions
The existing code only handles the case where the intrinsic being
rewritten is used as the called function pointer of a call/invoke.