hasOnlyColdCalls skipped over calls to intrinsics, but it did so after
checking the linkage of the called function. This meant that the presence
of a call to a debug intrinsic could affect the outcome of the
optimization.
In my original reproducer (for an out of tree target) it was particularly
interesting, because the actual IR after GlobalOpt was not different with
debug intrinsics present, so -print-after-all printouts didn't show
anything there.
However, without debuginfo, GlobalOpt went further and ran
BlockFrequencyAnalysis and (more importantly) LoopAnalysis, and later on in
the pipeline, instcombine behaved in different ways when LoopInfo was
present.
So a call to a dbg.declare prevented running LoopAnalysis in
GlobalOpt, which later prevented InstCombine from doing an optimization.
The dbg-intrinsic-loopanalysis.ll testcase tries to expose this.
I also noticed that adding a dbg.declare made the existing testcase
colccc_coldsites.ll generate different code, so I modified it to check
that it behaves the same way with and without the dbg.declare.
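The ordering problem can be illustrated with a small standalone model. This is only a sketch in Python, not the GlobalOpt code: the `Call` record, its fields, and both function names are hypothetical, and the real check has more conditions than are modeled here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Call:
    """Hypothetical stand-in for a call site; not an LLVM class."""
    callee: str
    is_intrinsic: bool
    has_local_linkage: bool
    is_cold: bool


def only_cold_calls_buggy(calls: List[Call]) -> bool:
    # Linkage is tested first, so a debug intrinsic (which does not have
    # local linkage) fails the query before the intrinsic skip is reached.
    for call in calls:
        if not call.has_local_linkage:
            return False
        if call.is_intrinsic:
            continue
        if not call.is_cold:
            return False
    return True


def only_cold_calls_fixed(calls: List[Call]) -> bool:
    # Intrinsics are skipped before any other property is inspected.
    for call in calls:
        if call.is_intrinsic:
            continue
        if not call.has_local_linkage:
            return False
        if not call.is_cold:
            return False
    return True
```

With a single cold call to a local function both variants agree; prepending a `dbg.declare`-like entry flips only the buggy variant to `False`.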
Reviewed By: nikic, fhahn
Differential Revision: https://reviews.llvm.org/D133193
Use the getPredicateOnEdge method if the value is a non-local
compare-with-a-constant instruction; it can give more precise
results than getConstantOnEdge.
Differential Revision: https://reviews.llvm.org/D131956
I'm not sure how much to add to the description as we've tried to allow targets to interpret the TargetCostKind enums in their own way. But we need to make it clear that certain cost kinds need to match threshold numbers used by various passes (and vice-versa when passes are determining a cost-benefit threshold).
I'm not keen on the "The weighted sum of size and latency" description, but it's very difficult to come up with anything else that's suitably generic (e.g. X86 will use uop counts here to easily work with LoopMicroOpBufferSize thresholds, even though high latency fdiv/fsqrt instructions still often have low uop counts).
Differential Revision: https://reviews.llvm.org/D132288
This removes a bunch of duplicated code, by adding an intermediate
function simplifyReduction that takes a std::function argument
for the actual replacement of the code.
No functional change intended.
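The shape of this refactoring can be sketched as follows. This is a Python illustration, not the Flang code: the dictionary-based call record, the rank precondition, and the `simplify_sum`/`simplify_maxval` wrappers with their replacement strings are all hypothetical; only the idea of an intermediate function taking a callable for the replacement step comes from the description above.

```python
from typing import Callable, Dict, List

CallRecord = Dict[str, object]


def simplify_reduction(call: CallRecord,
                       build_replacement: Callable[[CallRecord], str],
                       out: List[str]) -> bool:
    """Shared checks live here; the rewrite itself is delegated."""
    # Placeholder precondition standing in for the common legality checks.
    if call["rank"] != 1:
        return False
    out.append(build_replacement(call))
    return True


def simplify_sum(call: CallRecord, out: List[str]) -> bool:
    return simplify_reduction(
        call, lambda c: f"call @fast_sum({c['arg']})", out)


def simplify_maxval(call: CallRecord, out: List[str]) -> bool:
    return simplify_reduction(
        call, lambda c: f"call @fast_maxval({c['arg']})", out)
```

Each per-intrinsic wrapper then shrinks to the one piece that differs, namely how the replacement is built.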
Reviewed By: vzakhari
Differential Revision: https://reviews.llvm.org/D132588
Running: `mlir-opt -test-vector-warp-distribute=rewrite-warp-ops-to-scf-if -canonicalize -verify-each=0`.
Prior to this revision, IR resembling the following would be produced:
```
%4 = "vector.load"(%3, %arg0) : (memref<1x32xf32, 3>, index) -> vector<1x1xf32>
```
This fails verification since it needs 2 indices to load but only 1 is provided.
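The rule being violated can be stated as a one-line check. The sketch below is Python, not the MLIR verifier; `load_indices_ok` is a hypothetical name, and the second index name in the usage note is made up for illustration.

```python
from typing import Sequence


def load_indices_ok(memref_shape: Sequence[int],
                    indices: Sequence[str]) -> bool:
    """A load needs exactly one index per memref dimension."""
    return len(indices) == len(memref_shape)
```

For the `memref<1x32xf32, 3>` above, `load_indices_ok((1, 32), ["%arg0"])` is `False`, while supplying two indices such as `["%c0", "%arg0"]` passes.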
Differential Revision: https://reviews.llvm.org/D133106
I met this issue when building LLVM with LLVM_LIBDIR_SUFFIX=64: the
installation destination of scan-build-py does not respect the given
suffix.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D133160
This patch adds a contiguity check with the runtime to avoid
copy-in/copy-out when the actual argument is already contiguous.
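As a rough illustration of what such a check decides, the following Python sketch tests column-major contiguity from extents and byte strides. It is not the Flang runtime routine: the function names and the descriptor layout (one extent and one byte stride per dimension) are assumptions made for the example.

```python
from typing import Sequence


def is_contiguous(extents: Sequence[int],
                  byte_strides: Sequence[int],
                  elem_size: int) -> bool:
    """Column-major contiguity: each stride equals the size of all
    lower dimensions combined."""
    if any(extent == 0 for extent in extents):
        return True  # an empty array is trivially contiguous
    expected = elem_size
    for extent, stride in zip(extents, byte_strides):
        # The stride of a dimension with a single element is irrelevant.
        if extent > 1 and stride != expected:
            return False
        expected *= extent
    return True


def passing_strategy(extents: Sequence[int],
                     byte_strides: Sequence[int],
                     elem_size: int) -> str:
    if is_contiguous(extents, byte_strides, elem_size):
        return "pass-directly"
    return "copy-in/copy-out"
```

A whole 4x3 array of 4-byte elements (strides 4 and 16) is passed directly, whereas a section taking every other row (extents 2x3, strides 8 and 16) still needs the temporary.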
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D133097
Currently, we bail out of scalar promotion if the loop may unwind
and the memory may be visible on unwind. This is because we can't
insert stores of the promoted value on unwind edges.
However, nowadays scalar promotion also has support for only
promoting loads, while leaving stores in place. This kind of
promotion is safe even in the presence of unwinding.
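The difference between the two kinds of promotion can be seen in a small simulation. This is a Python model, not LICM: the loop body, the `Unwind` exception, and all function names are invented for illustration, with a dictionary standing in for memory that stays visible after unwinding.

```python
class Unwind(Exception):
    """Models an exception leaving the loop."""


def original(mem, n, throw_at):
    for i in range(n):
        v = mem["x"]        # load inside the loop
        mem["x"] = v + 1    # store inside the loop
        if i == throw_at:
            raise Unwind


def loads_promoted(mem, n, throw_at):
    v = mem["x"]            # load hoisted out of the loop
    for i in range(n):
        v = v + 1
        mem["x"] = v        # store left in place
        if i == throw_at:
            raise Unwind


def loads_and_stores_promoted(mem, n, throw_at):
    v = mem["x"]
    for i in range(n):
        v = v + 1
        if i == throw_at:
            raise Unwind    # memory is stale on this path
    mem["x"] = v            # store sunk to the normal exit only


def run(fn, n, throw_at):
    mem = {"x": 0}
    try:
        fn(mem, n, throw_at)
    except Unwind:
        pass
    return mem["x"]
```

When the loop unwinds in its third iteration, `original` and `loads_promoted` both leave 3 in memory, while `loads_and_stores_promoted` leaves the initial 0; without unwinding all three agree.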
Differential Revision: https://reviews.llvm.org/D133111
For a no-op store formed by a LoadI/StoreI pair, the invariant that
must hold is that the memory state of the related MemoryLoc before
LoadI is the same as before StoreI.
For this example:
```
define void @pr49927(i32* %q, i32* %p) {
%v = load i32, i32* %p, align 4
store i32 %v, i32* %q, align 4
store i32 %v, i32* %p, align 4
ret void
}
```
Here the definition reaching the store's destination differs from the
definition reaching the load's location, so the invariant mentioned
above seems to be broken. However, that intervening definition writes
exactly the value produced by LoadI, so the invariant actually still
holds and we can safely ignore it.
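The argument can be checked by simulating @pr49927 for both aliasing cases. The Python sketch below is a hand-written model with a dictionary as memory; `run_pr49927` and the concrete addresses are made up for the example.

```python
def run_pr49927(p, q, initial, drop_last_store):
    """Executes the three memory operations of @pr49927 on a copy of
    `initial`, optionally omitting the final store to %p."""
    mem = dict(initial)
    v = mem[p]          # %v = load i32, i32* %p
    mem[q] = v          # store i32 %v, i32* %q
    if not drop_last_store:
        mem[p] = v      # store i32 %v, i32* %p
    return mem
```

Whether `%p` and `%q` are distinct (`p=0, q=1`) or alias (`p=q=0`), the final memory is the same with and without the last store, which is why the store can be treated as a no-op.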
Differential Revision: https://reviews.llvm.org/D132657
Commit 4adc5bead4 moved a dependence on llvm-jitlink from
SANITIZER_COMMON_LIT_TEST_DEPS to ORC_TEST_DEPS, but in doing so it moved it
out from under a 'NOT COMPILER_RT_STANDALONE_BUILD ...' conditional. This led
to failures on standalone builds.
This commit adds the conditional to the ORC_TEST_DEPS assignment to work
around the issue while we look for a longer-term fix.
rdar://99453446
This patch fixes:
lldb/source/Plugins/Instruction/RISCV/EmulateInstructionRISCV.h:51:5:
error: default label in switch which covers all enumeration values
[-Werror,-Wcovered-switch-default]
When instrumentation is added after an instruction, it should
still carry the debug info of that instruction.
Reviewed By: kda, kstoimenov
Differential Revision: https://reviews.llvm.org/D133091
When folding extractvalue (any_mul_with_overflow X, -1) --> (-X and icmp),
the match partly failed for vector constants with a poison element.
This patch tries to fix that.
Alive2: https://alive2.llvm.org/ce/z/2rGp_3
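The arithmetic behind the fold can be checked exhaustively at a small bit width. The Python sketch below models 8-bit multiply-with-overflow by hand; the helper names are hypothetical, and the overflow conditions it tests (X equal to the signed minimum, or X unsigned-greater-than 1) are derived from the arithmetic rather than quoted from the patch.

```python
BITS = 8
MASK = (1 << BITS) - 1
SMIN = -(1 << (BITS - 1))
SMAX = (1 << (BITS - 1)) - 1


def to_signed(u):
    """Reinterprets an unsigned BITS-wide value as two's complement."""
    return u - (1 << BITS) if u > SMAX else u


def umul_with_overflow(x, y):
    full = x * y
    return full & MASK, full > MASK


def smul_with_overflow(x, y):
    full = x * y
    return to_signed(full & MASK), not (SMIN <= full <= SMAX)


def check_all():
    # Unsigned: -1 is the all-ones value MASK.
    for x in range(1 << BITS):
        value, overflow = umul_with_overflow(x, MASK)
        if value != (-x) & MASK or overflow != (x > 1):
            return False
    # Signed: multiplying by -1 overflows only for the minimum value.
    for x in range(SMIN, SMAX + 1):
        value, overflow = smul_with_overflow(x, -1)
        if value != to_signed((-x) & MASK) or overflow != (x == SMIN):
            return False
    return True
```

`check_all()` walks every 8-bit input for both signednesses and confirms that the value result is always the negation of X.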
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D132996
Add:
- most instructions from the RVI base instruction set.
- some instruction decode tests taken from objdump.
Further work:
- implement the RISC-V IMAC extensions.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D132789
Add the coarray intrinsic functions, lcobound and ucobound, to the
list of intrinsics. For both of these functions, add a check to
ensure that if the optional dim argument is present and statically
checkable, its value is in the inclusive range of 1 and the corank
of the coarray argument. In the semantics tests for lcobound and
ucobound, remove the XFAIL directive, add the ERROR directives, and
add additional standard-conforming and non-standard-conforming
calls.
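The range check described above amounts to the following. This is a Python sketch, not the Flang semantics code: `check_cobound_dim` is a hypothetical name, and the diagnostic wording is invented rather than the message the compiler emits.

```python
from typing import Optional


def check_cobound_dim(dim: Optional[int], corank: int) -> Optional[str]:
    """Returns a diagnostic string for a bad DIM, or None if there is
    nothing to report. `dim` is None when the argument is absent or
    not statically known."""
    if dim is None:
        return None
    if 1 <= dim <= corank:
        return None
    return f"DIM={dim} is not in the range 1 to {corank}"
```

For a coarray of corank 2, `dim` values 1 and 2 pass, while 0 and 3 produce a diagnostic.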
Reviewed By: klausler, craig.rasmussen
Differential Revision: https://reviews.llvm.org/D126721
arm64e platforms.
On arm64e-capable Apple platforms, the system libraries are always
arm64e, but applications often are arm64. When a target is created
from file, LLDB recognizes it as an arm64 target, but debugserver will
still (technically correctly) report the process as being arm64e. For
consistency, set the target to arm64 here.
rdar://92248684
Differential Revision: https://reviews.llvm.org/D133069
Demonstrates how the sparse tensor type -> tuple -> getter chain
will eventually yield actual code that works on the memrefs directly.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D133143
-fsyntax-only breaks the CUDA compilation pipeline down and makes it look like
multiple independent subcompilations, which trips the multiple-arguments check
when -o is specified.
We do want to allow -fsyntax-only to be used with otherwise unmodified clang
options as it's commonly used by various tooling.
Differential Revision: https://reviews.llvm.org/D133133
This patch adds the SparseTensorStorageExpansion pass, which flattens the tuple used to store a sparse
tensor handle.
Right now, it only sets up the skeleton of the pass; more lowering rules for sparse tensor storage
operations need to be added.
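What "flattening the tuple" means can be sketched with nested Python tuples standing in for tuple types. This is not the MLIR pass: `flatten_tuple_type` is a hypothetical helper, and the field types in the usage note are made up rather than taken from the actual sparse tensor storage layout.

```python
from typing import List, Tuple, Union

TypeTree = Union[str, Tuple["TypeTree", ...]]


def flatten_tuple_type(ty: TypeTree) -> List[str]:
    """Expands a (possibly nested) tuple type into its leaf types,
    preserving their order."""
    if isinstance(ty, tuple):
        leaves: List[str] = []
        for element in ty:
            leaves.extend(flatten_tuple_type(element))
        return leaves
    return [ty]
```

For example, `flatten_tuple_type(("memref<?xindex>", ("memref<?xindex>", "memref<?xf64>")))` yields three separate memref types, so a value of the tuple type can be replaced by three individual values.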
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D133125
Instead of using the IV block argument of the do-loop we will use
the do-variable value loaded from its location. This usage is consistent
with other uses of the do-variable inside the loop.
Differential Revision: https://reviews.llvm.org/D133140
Previously, if you specified no_sanitize("known_sanitizer") on a global, you
would get a misleading error "'no_sanitize' attribute only applies to
functions and methods", but no_sanitize("unknown") would simply be a warning,
"unknown sanitizer 'unknown' ignored". This changes the former to a warning
"'no_sanitize' attribute argument not supported for globals: known_sanitizer".
Differential Revision: https://reviews.llvm.org/D133117
Added numerical splat folders for comparison operations, and a folder
for equal applied to two identical int values.
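A rough model of the two folds follows. It is a Python sketch, not the MLIR folders: the `Splat` class and both function names are hypothetical, operands are identified by SSA-like names for the "identical value" case, and restricting that case to integers reflects the fact that a floating-point value need not compare equal to itself when it is NaN.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Optional, Tuple, Union

Scalar = Union[int, float, bool]


@dataclass(frozen=True)
class Splat:
    """A constant tensor whose elements all share one value."""
    value: Scalar
    shape: Tuple[int, ...]


def fold_compare(pred: Callable[[Scalar, Scalar], bool],
                 lhs: object, rhs: object) -> Optional[Splat]:
    # Two splats of the same shape compare to a splat of the scalar result.
    if (isinstance(lhs, Splat) and isinstance(rhs, Splat)
            and lhs.shape == rhs.shape):
        return Splat(pred(lhs.value, rhs.value), lhs.shape)
    return None  # not foldable


def fold_equal_same_operand(lhs_name: str, rhs_name: str,
                            is_integer: bool,
                            shape: Tuple[int, ...]) -> Optional[Splat]:
    # x == x is always true for integers; floats are excluded (NaN).
    if is_integer and lhs_name == rhs_name:
        return Splat(True, shape)
    return None
```

For instance, `fold_compare(operator.gt, Splat(3, (2, 2)), Splat(1, (2, 2)))` gives `Splat(True, (2, 2))`, while a comparison against a non-constant operand is left alone.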
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D133138
Update the utility functions for checking exceptional values of math
functions to use cpp::optional return values.
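The calling pattern this enables looks roughly like the following, shown in Python with `Optional` standing in for `cpp::optional`. The table contents, `lookup_exceptional`, and `some_math_function` are all made up for illustration and do not correspond to any real libc function or exceptional value.

```python
from typing import Optional

# Hypothetical table of inputs whose results need special handling.
_EXCEPTIONAL = {0.0: 0.0, 1.5: 0.25}


def lookup_exceptional(x: float) -> Optional[float]:
    """Returns the precomputed result for an exceptional input,
    or None when the general path should be used."""
    return _EXCEPTIONAL.get(x)


def some_math_function(x: float) -> float:
    result = lookup_exceptional(x)
    # Test against None explicitly: 0.0 is a valid result but is falsy.
    if result is not None:
        return result
    return x * 2.0  # placeholder for the general computation
```

Returning an optional keeps "no special case" distinct from every legitimate result, including zero, without an extra out-parameter or sentinel value.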
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D133134
We currently instrument CallBrInst but do not annotate it with
the branch weight. This patch enables PGO annotation of CallBrInst.
Differential Revision: https://reviews.llvm.org/D133040
At least `ntdll` is using the undocumented version 2 unwind info, and opcode 6, which is already defined as `UOP_Epilog`.
Using `llvm-objdump --unwind` with `ntdll` would previously result in unreachable assertions because this code was missing from `getNumUsedSlots` and `getUnwindCodeTypeName`.
The slot counts for these codes come from 57bfe47451/src/coreclr/inc/win64unwind.h (L51-L52), which I assume is a good authoritative source.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D107655