If a ComplexType's element type is buildable, then that ComplexType should be
buildable. This is accomplished by the introduction of a new ODS class called
`SameBuildabilityAs`. This can be used by other types that are conditionally
buildable.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D93892
This reverts commit 94427af60c (relands
4646de5d75 with fix).
Use "return std::move(AsmStreamer);" instead of "return AsmStreamer;" in
LVMTargetMachine::createMCStreamer. Unlike Clang, GCC seems having trouble
inserting a implicit lvalue->rvalue conversion.
This change adds '--write-if-changed' flag to llvm-elfabi tool. When
enabled, llvm-elfabi will not overwrite the existing file if the
content of the file will not be changed, which preserves the
timestamp.
Differential Revision: https://reviews.llvm.org/D92902
As of Linux 5.10, the kernel may report either of the two following
crash reasons:
- SEGV_MTEAERR: async MTE tag check fault
- SEGV_MTESERR: sync MTE tag check fault
Teach LLDB about them.
Differential Revision: https://reviews.llvm.org/D93495
This prints OptRemarks at each location where a decision is made to not
outline, or to outline a specific section for the IROutliner pass.
Test:
llvm/test/Transforms/IROutliner/opt-remarks.ll
Reviewers: jroelofs, paquette
Differential Revision: https://reviews.llvm.org/D87300
The switch duplicated the translation in getRecurrenceBinOp().
This code is still weird because it translates to the TTI
ReductionFlags for min/max, but then createSimpleTargetReduction()
converts that back to RecurrenceDescriptor::MinMaxRecurrenceKind.
I'm not sure if the SLP enum was created before the IVDescriptor
RecurrenceDescriptor / RecurrenceKind existed, but the code in
SLP is now redundant with that class, so it just makes things
more complicated to have both. We eventually call LoopUtils
createSimpleTargetReduction() to create reduction ops, so we
might as well standardize on those enum names.
There's still a question of whether we need to use TTI::ReductionFlags
vs. MinMaxRecurrenceKind, but that can be another clean-up step.
Another option would just be to flatten the enums in RecurrenceDescriptor
into a single enum. There isn't much benefit (smaller switches?) to
having a min/max subset.
This complements the existing RVV ISel patterns for arithmetic, bitwise
and shifts with the remaining operations in those categories: sub, and,
xor, sra.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D93852
This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D93793
This adds a cost model that takes into account the total number of
machine instructions to be removed from each region, the number of
instructions added by adding a new function with a set of instructions,
and the instructions added by handling arguments.
Tests not adding flags:
llvm/test/Transforms/IROutliner/outlining-cost-model.ll
Reviewers: jroelofs, paquette
Differential Revision: https://reviews.llvm.org/D87299
If the destination is tied, then user has some control of the
register used for input. They would have the ability to control
the value of any tail elements. By using tail agnostic we take
this option away from them.
Its not clear that the intrinsics are defined such that this isn't
supposed to work. And undisturbed is a valid implementation for agnostic
so code wouldn't even fail to work on all systems if we always used
agnostic.
The vcompress intrinsic is defined to require tail undisturbed so
at minimum we need this for that instruction or need to redefine
the intrinsic.
I've made an exception here for vmv.s.x/fmv.s.f and reduction
instructions which only write to element 0 regardless of the tail
policy. This allows us to keep the agnostic policy on those which
should allow better redundant vsetvli removal.
An enhancement would be to check for undef input and keep the
agnostic policy, but we don't have good test coverage for that yet.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D93878
And add it to the AMDGPU opt pipeline.
This is a function pass instead of a module pass (like the legacy pass)
because it's getting added to a CGSCCPassManager, and you can't put a
module pass in a CGSCCPassManager.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D93885
The spec for these instructions include this note. "The destination register
cannot overlap either the source register or the mask register ('v0') if the
instruction is masked." So we need earlyclobber to enforce this constraint.
I've regenerated the tests with update_llc_test_checks.py to show the
effects of the earlyclobber.
Reviewed By: khchen, frasercrmck
Differential Revision: https://reviews.llvm.org/D93867
As Mikael Holmén is noting in the post-commit review for the first fix
https://reviews.llvm.org/rGd4ccef38d0bb#967466
not hoisting constantexprs is not enough,
because if the xor originally was a constantexpr (i.e. X is a constantexpr).
`SimplifyAssociativeOrCommutative()` in `visitXor()` will immediately
undo this transform, thus again causing an infinite combine loop.
This transform has resulted in a surprising number of constantexpr failures.
We will emit these permuted nodes on all VSX little endian subtargets
but don't have the patterns available to match them on subtargets
that don't have direct moves.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=47916
On subtargets prior to Power9, conversions to/from half precision
are lowered to libcalls. This makes loops containing such operations
invalid candidates for HW loops.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=48519
This patch upstreams support for the Armv8-a Cortex-A78C
processor for AArch64 and ARM.
In detail:
Adding cortex-a78c as cpu option for aarch64 and arm targets in clang
Adding Cortex-A78C CPU name and ProcessorModel in llvm
Details of the CPU can be found here:
https://www.arm.com/products/silicon-ip-cpu/cortex-a/cortex-a78c
This commit copies existing tests at llvm/Transforms containing
'shufflevector X, undef' and replaces them with 'shufflevector X, poison'.
The new copied tests have *-inseltpoison.ll suffix at its file name
(as db7a2f347f did)
See https://reviews.llvm.org/D93793
Test files listed using
grep -R -E "^[^;]*shufflevector <.*> .*, <.*> undef" | cut -d":" -f1 | uniq
Test files copied & updated using
file_org=llvm/test/Transforms/$1
if [[ "$file_org" = *-inseltpoison.ll ]]; then
file=$file_org
else
file=${file_org%.ll}-inseltpoison.ll
if [ ! -f $file ]; then
cp $file_org $file
fi
fi
sed -i -E 's/^([^;]*)shufflevector <(.*)> (.*), <(.*)> undef/\1shufflevector <\2> \3, <\4> poison/g' $file
head -1 $file | grep "Assertions have been autogenerated by utils/update_test_checks.py" -q
if [ "$?" == 1 ]; then
echo "$file : should be manually updated"
# The test is manually updated
exit 1
fi
python3 ./llvm/utils/update_test_checks.py --opt-binary=./build-releaseassert/bin/opt $file
We can reduce the number of "using" declarations.
`LLVM_ELF_IMPORT_TYPES_ELFT` was extended in D93801.
Differential revision: https://reviews.llvm.org/D93856