The test failed since commit
bc10888dc "DomTree: Make PostDomTree indifferent to block successors swap"
which is a re-commit of
c35585e20 "DomTree: Make PostDomTree immune to block successors swap"
If we can't identify alloca used in lifetime marker we
need to assume to worst case scenario.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D84630
This checker appears to be intentionally not diagnosing cases where an
operator appearing in a duplicated expression might have side-effects;
Clang is now modeling fold-expressions as having an unresolved operator
name within them, so they now trip up this check.
addGlobalValueSummary can check newly added FunctionSummary
and set HasParamAccess to mark that generateParamAccessSummary
is needed.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D85182
Change to not generate truncate instructions if all use of a truncate
operation don't care about higher bits. For example, an i32 add
instruction doesn't care about higher 32 bits in 64 bit registers.
Updates regression tests also.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D85418
In GlobalISel, if you have a load into a small type with a range, you'll hit
an assert if you try to compute known bits on it starting at a larger type.
e.g.
```
%x:_(s8) = G_LOAD %whatever(p0) :: (load 1 ... !range !n)
...
%y:_(s32) = G_SOMETHING %x
```
When we walk through G_SOMETHING and hit the load, the width of our known bits
is 32. However, the width of the range is going to be 8. This will cause us
to hit an assert.
To fix this, make computeKnownBitsFromRangeMetadata zero extend or truncate
the range type to match the bitwidth of the known bits we're calculating.
Add a testcase in CodeGen/GlobalISel/KnownBitsTest.cpp to reflect that this
works now.
https://reviews.llvm.org/D85375
The implementation of these classes was copied & pasted from the
iPhone simulator plugin with only a handful of configuration
parameters substituted. This patch moves the redundant implementations
into the base class PlatformAppleSimulator.
Differential Revision: https://reviews.llvm.org/D85243
As of GN 3028c6a426a4, the hack that transformed "libs" ending in
".framework" from -l arguments to -framework arguments has been removed.
Instead, "frameworks" must be used, and the toolchain must provide
support.
Differential Revision: https://reviews.llvm.org/D84219
tl;dr See D81784 for the 'tombstone value' concept. This patch changes our behavior to be almost the same as GNU ld (except that we also use 1 for .debug_loc):
* .debug_ranges & .debug_loc: 1 (LLD<11: 0+addend; GNU ld uses 1 for .debug_ranges)
* .debug_*: 0 (LLD<11: 0+addend; GNU ld uses 0; future LLD: -1)
We make the tweaks because:
1) The new tombstone is novel and needs more time to be adopted by consumers before it's the default.
2) The old (gold) strategy had problems with zero-length functions - so rather than going back that, we're going to the GNU ld strategy which doesn't have that problem.
3) One slight tweak to (2) is to apply the .debug_ranges workaround to .debug_loc for the same reasons it applies to debug_ranges - to avoid terminating lists early.
-----
http://lists.llvm.org/pipermail/llvm-dev/2020-July/143482.html
The tombstone value -1 in .debug_line caused problems to lldb (fixed by D83957;
will be included in 11.0.0) and breakpad (fixed by
https://crrev.com/c/2321300). It may potentially affects other DWARF consumers.
For .debug_ranges & .debug_loc: 1, an argument preferring 1 (GNU ld for .debug_ranges) over -2 is that:
```
{-1, -2} <<< base address selection entry
{0, length} <<< address range
```
may create a situation where low_pc is greater than high_pc. So we use
1, the GNU ld behavior for .debug_ranges
For other .debug_* sections, there haven't been many reports. One issue is that
bloaty (src/dwarf.cc) can incorrectly count address ranges in .debug_ranges . To
reduce similar disruption, this patch changes the tombstone values to be similar to GNU ld.
This does mean another behavior change to the default trunk behavior. Sorry
about it. The default trunk behavior will be similar to release/11.x while we work on a transition plan for LLD users.
Reviewed By: dblaikie, echristo
Differential Revision: https://reviews.llvm.org/D84825
Buildbot reported a build failure when building shared
library libLLVMBPFCodeGen.so with unknown reference to
"createCFGSimplificationPass".
Commit 87cba43402 ("BPF: add a SimplifyCFG IR pass during
generic Scalar/IPO optimization") added an IR pass SimplifyCFG
by BPF target. The commit called function
createCFGSimplificationPass() defined in "Scalar" library.
Add this library in Target/BPF/LLVMBuild.txt so
shared library build can succeed.
Negator knows how to do this, but the one-use reasoning is getting
a bit muddy here, we don't really want to increase instruction count,
so we need to both lie that "IsNegation" and have an one-use check
on the outermost LHS value.
Multiplication is commutative, and either of operands can be negative,
so if the RHS is a negated power-of-two, we should try to make it
true power-of-two (which will allow us to turn it into a left-shift),
by trying to sink the negation down into LHS op.
But, we shouldn't re-invent the logic for sinking negation,
let's just use Negator for that.
Tests and original patch by: Simon Pilgrim @RKSimon!
Differential Revision: https://reviews.llvm.org/D85446
with how it is done for a lean binary
In particular this affects how target create --arch is handled — it
allowed us to override the deployment target (a useful feature for the
expression evaluator), but the fat binary case didn't.
rdar://problem/66024437
Differential Revision: https://reviews.llvm.org/D85049
(cherry picked from commit 470bdd3caaab0b6e0ffed4da304244be40b78668)
This warning diagnoses cases where an expression is compared to a
constant, and the comparison is tautological due to the form of the
expression (but not merely due to its type). This applies in cases such
as comparisons of bit-fields and the result of bit-masks.
The new warning is added to the Clang diagnostic group
-Wtautological-constant-in-range-compare but not to the
formerly-equivalent GCC-compatibility diagnostic group -Wtype-limits,
which retains its old meaning of diagnosing only tautological
comparisons to extremal values of a type (eg, int > INT_MAX).
Reviewed By: rtrieu
Differential Revision: https://reviews.llvm.org/D85256
One of the callers only wants the condition, but the vselect can
be simplified by getNode making it hard or impossible to retrieve
the condition.
Instead, return the condition and make the other 2 callers
responsible for creating the vselect node using the condition.
Rename the function to WidenVSELECTMask accordingly.
Differential Revision: https://reviews.llvm.org/D85468
We had a conversion from const char * to StringRef and const char *
to std::string conversion. These both do their own
strlen call if the compiler doens't figure out how to share them.
By adding the temporary StringRef we can convert it to std::string
instead.
The other case is to use a StringSwitch<StringRef> instead of
StringSwitch<const char *> since the output values of the switch
are string literals. This allows the length to be computed at
compile time. Otherwise we have to convert from const char *
to std::string after the StringSwitch.
I believe this function used to be called directly from X86
specific code and was used to immediately create -target-cpu
command line. A later refactoring changed it to to be called from
a generic getCPU function that returns std::string. So on some
paths we created a string using MakeArgString converted that to
std::string then called MakeArgString again from that.
Instead just return std::string directly like the other targets.
The following bpf linux kernel selftest failed with latest
llvm:
$ ./test_progs -n 7/10
...
The sequence of 8193 jumps is too complex.
verification time 126272 usec
stack depth 320
processed 114799 insns (limit 1000000)
...
libbpf: failed to load object 'pyperf600_nounroll.o'
test_bpf_verif_scale:FAIL:110
#7/10 pyperf600_nounroll.o:FAIL
#7 bpf_verif_scale:FAIL
After some investigation, I found the following llvm patch
https://reviews.llvm.org/D84108
is responsible. The patch disabled hoisting common instructions
in SimplifyCFG by default. Later on, the code changes and a
SimplifyCFG phase with hoisting on cannot do the work any more.
A test is provided to demonstrate the problem.
The IR before simplifyCFG looks like:
for.cond:
%i.0 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]
%cmp = icmp ult i32 %i.0, 6
br i1 %cmp, label %for.body, label %for.cond.cleanup
for.cond.cleanup:
%2 = load i8*, i8** %frame_ptr, align 8, !tbaa !2
%cmp2 = icmp eq i8* %2, null
%conv = zext i1 %cmp2 to i32
call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %1) #3
call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %0) #3
ret i32 %conv
for.body:
%3 = load i8*, i8** %frame_ptr, align 8, !tbaa !2
%tobool.not = icmp eq i8* %3, null
br i1 %tobool.not, label %for.inc, label %land.lhs.true
The first two insns of `for.cond.cleanup` and `for.body`, load and
icmp, can be hoisted to `for.cond` block. With Patch D84108, the
optimization is delayed. But unfortunately, later on loop rotation
added addition phi nodes to `for.body` and hoisting cannot
be done any more.
Note such a hoisting is beneficial to bpf programs as
bpf verifier does path sensitive analysis and verification.
The hoisting preverts reloading from stack which will assume
conservative value and increase exploited insns. In this case,
it caused verifier failure.
To fix this problem, I added an IR pass from bpf target
to performance additional simplifycfg with hoisting common inst
enabled.
Differential Revision: https://reviews.llvm.org/D85434
Add address space to indirect abi info and use it for kernels.
Previously, indirect arguments assumed assumed a stack passed object
in the alloca address space using byval. A stack pointer is unsuitable
for kernel arguments, which are passed in a separate, constant buffer
with a different address space.
Start using the new byref for aggregate kernel arguments. Previously
these were emitted as raw struct arguments, and turned into loads in
the backend. These will lower identically, although with byref you now
have the option of applying an explicit alignment. In the future, a
reasonable implementation would use byref for all kernel arguments
(this would be a practical problem at the moment due to losing things
like noalias on pointer arguments).
This is mostly to avoid fighting the optimizer's treatment of
aggregate load/store. SROA and instcombine both turn aggregate loads
and stores into a long sequence of element loads and stores, rather
than the optimizable memcpy I would expect in this situation. Now an
explicit memcpy will be introduced up-front which is better understood
and helps eliminate the alloca in more situations.
This skips using byref in the case where HIP kernel pointer arguments
in structs are promoted to global pointers. At minimum an additional
patch is needed to allow coercion with indirect arguments. This also
skips using it for OpenCL due to the current workaround used to
support kernels calling kernels. Distinct function bodies would need
to be generated up front instead of emitting an illegal call.
The code in ObjectFileMachO didn't disambiguate between ios and
ios-simulator object files for Mach-O objects using the legacy
ambiguous LC_VERSION_MIN load commands. This used to not matter before
taught ArchSpec that ios and ios-simulator are no longer compatible.
rdar://problem/66545307
Differential Revision: https://reviews.llvm.org/D85358
Summary:
This patch takes the indices operands of `insertelement`/`insertvalue`
into account while generation of seed elements for `findBuildAggregate()`.
This function has kept the original order of `insert`s before.
Also this patch optimizes `findBuildAggregate()` preventing it from
redundant temporary vector allocations and its multiple reversing.
Fixes llvm.org/pr44067
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83779
This change fixes errors reported by Control Flow Integrity (CFI) checking when using `std::packaged_task`. The errors mostly stem from casting the underlying storage (`__buf_`) to `__base*`, even if it is uninitialized. The solution is to wrap `__base*` access to `__buf_` behind a getter marked with _LIBCPP_NO_CFI.
Differential Revision: https://reviews.llvm.org/D82627
Inside clangd, clang-tidy checks don't see preprocessor events in the preamble.
This leads to `Token::PtrData == nullptr` for tokens that the macro is defined to.
E.g. `#define SIGTERM 15`:
- Token::Kind == tok::numeric_constant (Token::isLiteral() == true)
- Token::UintData == 2
- Token::PtrData == nullptr
As the result of this, bugprone-bad-signal-to-kill-thread check crashes at null-dereference inside clangd.
Reviewed By: hokein
Differential Revision: https://reviews.llvm.org/D85417
Add cases of fused fmul+fadd/fsub with f16 and f64 operands to cost model.
Also added operations with contract attribute.
Fixed line endings in test.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D84995