Summary:
Without this patch, the jump threading pass ignores profiling data
whenever we invoke the pass with the new pass manager.
Specifically, JumpThreadingPass::run calls runImpl with class variable
HasProfileData always set to false. In turn, runImpl sets
HasProfileData to false again:
HasProfileData = HasProfileData_;
In the end, we don't use profiling data at all with the new pass
manager.
This patch fixes the problem by passing F.hasProfileData() to runImpl.
The bug appears to have been introduced at:
https://reviews.llvm.org/D41461
which removed local variable HasProfileData in JumpThreadingPass::run
even though there was one more use left in the same function. As a
result, the remaining use ended referring to the class variable
instead.
Note that F.hasProfileData is an extremely lightweight function, so I
don't see the need to cache its result. Once this patch is approved,
I'm planning to stop caching the result of F.hasProfileData in
runOnFunction.
Reviewers: wmi, eli.friedman
Subscribers: hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70509
Commit a0841dfe85 ("[BPF] Fix a bug in peephole optimization")
fixed a bug in peephole optimization. Recursion is introduced
to handle COPY and PHI instructions.
Unfortunately, multiple PHI instructions may form a cycle
and this will cause infinite recursion, eventual segfault.
For Commit a0841dfe85, I indeed tried a few loops to ensure
that I won't see the recursion, but I did not try with
complex control flows, which, as demonstrated with the test case
in this patch, may introduce PHI cycles.
This patch fixed the issue by introducing a set to remember
visited PHI instructions. This way, cycles can be properly
detected and handled.
Differential Revision: https://reviews.llvm.org/D70586
Summary:
This patch is a follow up on read-only assembly patch D70182.
It intends to enable object file generation for the read-only data section on AIX.
Reviewers: DiggerLin, daltenty
Differential Revision: https://reviews.llvm.org/D70455
Summary:
If -resource-dir is not specified as part of the compilation command, then by default
clang-scan-deps picks up a directory relative to its own path as resource-directory.
This is probably not the right behavior - since resource directory should be picked relative
to the path of the clang-compiler in the compilation command.
This patch adds support for it along with a cache to store the resource-dir paths based on
compiler paths.
Notes:
1. "-resource-dir" is a behavior that's specific to clang, gcc does not have that flag. That's why if I'm not able to find a resource-dir, I quietly ignore it.
2. Should I also use the mtime of the compiler in the cache? I think its not strictly necessary since we assume the filesystem is immutable.
3. From my testing, this does not regress performance.
4. Will try to get this tested on Windows.
But basically the problem that this patch is trying to solve is, clients might not always want to specify
"-resource-dir" in their compile commands, so scan-deps must auto-infer it correctly.
Reviewers: arphaman, Bigcheese, jkorous, dexonsmith, klimek
Reviewed By: Bigcheese
Subscribers: MaskRay, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69122
Summary: Working towards Johannes's suggestion for fixme, in Attributor's Noalias attribute deduction.
(ii) Check whether the value is captured in the scope using AANoCapture.
FIXME: This is conservative though, it is better to look at CFG and
// check only uses possibly executed before this call site.
A Reachability abstract attribute answers the question "does execution at point A potentially reach point B". If this question is answered with false for all other uses of the value that might be captured, we know it is not *yet* captured and can continue with the noalias deduction. Currently, information AAReachability provides is completely pessimistic.
Reviewers: jdoerfert
Reviewed By: jdoerfert
Subscribers: uenoku, sstefan1, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D70233
Summary:
This is a preparatory cleanup before i add more
of this fold to deal with comparisons with non-zero.
In essence, the current lowering is:
```
Name: (X % C1) == 0 -> X * C3 <= C4
Pre: (C1 u>> countTrailingZeros(C1)) * C3 == 1
%zz = and i8 C3, 0 ; trick alive into making C3 avaliable in precondition
%o0 = urem i8 %x, C1
%r = icmp eq i8 %o0, 0
=>
%zz = and i8 C3, 0 ; and silence it from complaining about said reg
%C4 = -1 /u C1
%n0 = mul i8 %x, C3
%n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right
%n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right
%n3 = or i8 %n1, %n2 ; rotate right
%r = icmp ule i8 %n3, %C4
```
https://rise4fun.com/Alive/oqd
It kinda just works, really no weird edge-cases.
But it isn't all that great for when comparing with non-zero.
In particular, given `(X % C1) == C2`, there will be problems
in the always-false tautological case where `C2 u>= C1`:
https://rise4fun.com/Alive/pH3
That case is tautological, always-false:
```
Name: (X % Y) u>= Y
%o0 = urem i8 %x, %y
%r = icmp uge i8 %o0, %y
=>
%r = false
```
https://rise4fun.com/Alive/ofu
While we can't/shouldn't get such tautological case normally,
we do deal with non-splat vectors, so unless we want to give up
in this case, we need to fixup/short-circuit such lanes.
There are two lowering variants:
1. We can blend between whatever computed result and the correct tautological result
```
Name: (X % C1) == C2 -> X * C3 <= C4 || false
Pre: (C2 == 0 || C1 u<= C2) && (C1 u>> countTrailingZeros(C1)) * C3 == 1
%zz = and i8 C3, 0 ; trick alive into making C3 avaliable in precondition
%o0 = urem i8 %x, C1
%r = icmp eq i8 %o0, C2
=>
%zz = and i8 C3, 0 ; and silence it from complaining about said reg
%C4 = -1 /u C1
%n0 = mul i8 %x, C3
%n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right
%n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right
%n3 = or i8 %n1, %n2 ; rotate right
%is_tautologically_false = icmp ule i8 C1, C2
%res = icmp ule i8 %n3, %C4
%r = select i1 %is_tautologically_false, i1 0, i1 %res
```
https://rise4fun.com/Alive/PjT5https://rise4fun.com/Alive/1KV
2. We can invert the comparison result
```
Name: (X % C1) == C2 -> X * C3 <= C4 || false
Pre: (C2 == 0 || C1 u<= C2) && (C1 u>> countTrailingZeros(C1)) * C3 == 1
%zz = and i8 C3, 0 ; trick alive into making C3 avaliable in precondition
%o0 = urem i8 %x, C1
%r = icmp eq i8 %o0, C2
=>
%zz = and i8 C3, 0 ; and silence it from complaining about said reg
%C4 = -1 /u C1
%n0 = mul i8 %x, C3
%n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right
%n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right
%n3 = or i8 %n1, %n2 ; rotate right
%is_tautologically_false = icmp ule i8 C1, C2
%C4_fixed = select i1 %is_tautologically_false, i8 -1, i8 %C4
%res = icmp ule i8 %n3, %C4_fixed
%r = xor i1 %res, %is_tautologically_false
```
https://rise4fun.com/Alive/2xChttps://rise4fun.com/Alive/jpb5
3. We can expand into `and`/`or`:
https://rise4fun.com/Alive/WGnhttps://rise4fun.com/Alive/lcb5
Blend-one is likely better since we avoid having to load the
replacement from constant pool. `xor` is second best since
it's still pretty general. I'm not adding `and`/`or` variants.
Reviewers: RKSimon, craig.topper, spatel
Reviewed By: RKSimon
Subscribers: nick, hiraditya, xbolva00, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70051
The original bug report can be found
[here](https://github.com/clangd/clangd/issues/85)
Given the following code:
```c++
void function() {
auto Lambda = [](int a, double &b) {return 1.f;};
La^
}
```
Triggering the completion at `^` would show `(lambda)` before this patch
and would show signature `(int a, double &b) const`, build a snippet etc
with this patch.
Reviewers: sammccall
Reviewed by: sammccall
Differential revision: https://reviews.llvm.org/D70445
Summary: Ensure that breakpoint ivar is properly set in exception breakpoint resolver so that exception breakpoints set on dummy targets are resolved once real targets are created and run.
Reviewers: jingham
Reviewed By: jingham
Subscribers: teemperor, JDevlieghere, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D69880
Summary:
Instead of going to the debug_loc section directly, use new
DWARFDie::getLocations instead. This means that the code will now
automatically support debug_loclists sections.
This is the last usage of the old debug_loc methods, and they can now be
removed.
Reviewers: dblaikie, JDevlieghere, aprantl, SouraVX
Subscribers: hiraditya, probinson, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70534
Power9 has instructions to implement the semantics of SIGN_EXTEND_INREG for vector type.
Mark it as legal and add the match pattern.
Differential Revision: https://reviews.llvm.org/D69601
On RHEL, the OS tooling (ar, ranlib) is not deterministic by default.
Therefore, we cannot get bit-for-bit identical builds.
The goal of this patch is that it adds the flags required to force determinism.
Differential Revision: https://reviews.llvm.org/D64817
1. Currently only support the set of multilibs same to riscv-gnu-toolchain.
2. Fix testcase typo causes fail on Windows.
3. Fix testcases to set empty sysroot.
Reviewers: espindola, asb, kito-cheng, lenary
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D67508
Remove some cognitive load by renaming clang/Serialization/Module.h to
clang/Serialization/ModuleFile.h, since it declares the ModuleFile
class. This also makes editing a bit easier, since the basename of the
file no long conflicts with clang/Basic/Module.h, which declares the
Module class. Also move lib/Serialization/Module.cpp to
lib/Serialization/ModuleFile.cpp.
Fix a canonicalization problem for the newly added property accessor stubs that
was causing a wrong decl to be used for 'self' in the accessor's body farm.
Fix a crash when constructing a body farm for accessors of a property
that is declared and @synthesize'd in different (but related) interfaces.
Differential Revision: https://reviews.llvm.org/D70158
Add explicit setOperation actions for some to match their none
strict counterparts. This isn't required, but makes the code
self documenting that we didn't forget about strict fp. I've
used LibCall instead of Expand since that's more explicitly what
we want.
Only lrint/llrint/lround/llround are missing now.
Push the test separately ahead of time in order to find out whether
our Memory Sanitizer bots will be able to find the problem.
If not, I'll add a much more expensive test that repeats the current
test multiple times in order to show up on normal buildbots.
I really apologize for the potential temporary inconvenience!
I'll commit the fix as soon as I get the signal.
Differential Revision: https://reviews.llvm.org/D69962
float node
This patch add an option 'disable-strictnode-mutation' to prevent strict
node mutating to an normal node.
So we can make sure that the patch which sets strict-node as legal works
correctly.
Patch by Chen Liu(LiuChen3)
Differential Revision: https://reviews.llvm.org/D70226
The verification inside loop passes should be done under the
VerifyMemorySSA flag (enabled by EXPESIVE_CHECKS or explicitly with
opt), in order to not add to compile time during regular builds.
Summary:
This commit moves the `DiscardOutput` function in FuzzerIO to
FuzzerUtil, so fuchsia can have its own specialized version.
In fuchsia, accessing `/dev/null` is not supported, and there's nothing
similar to a file that discards everything that is written to it. The
way of doing something similar in fuchsia is by using `fdio_null_create`
and binding that to a file descriptor with `fdio_bind_to_fd`.
This change should fix one of the issues with the `-close_fd_mask` flag
in libfuzzer, in which closing stdout was not working due to
`fopen("/dev/null", "w")` returning `NULL`.
Reviewers: kcc, aarongreen
Subscribers: #sanitizers, llvm-commits
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D69593
The regcall was making 32-bit mode pass things in xmm registers
which made 32-bit and 64-bit more similar. But I have upcoming
patches that require them to be separated anyway.
We don't have a full sysroot yet, so for now we only include compiler
support and compiler-rt builtins, the rest of the runtimes will get
enabled later.
Differential Revision: https://reviews.llvm.org/D70477
Summary:
This commit fixes part of the issues with stack unwinding in fuchsia for
arm64 and x86_64. It consists of multiple fixes:
(1) The cfa_offset calculation was wrong, instead of pointing to the
previous stack pointer, it was pointing to the current one. It worked in
most of the cases because the crashing functions already had a
prologue and had their cfa information relative to another register. The
fix consists on adding a constant that can be used to calculate the
crashing function's stack pointer, and base all the cfi information
relative to that offset.
(2) (arm64) Due to errors with the syntax for the dwarf information, most
of the `OP_NUM` macros were not working. The problem was that they were
referred to as `r##NUM` (like `r14`), when it should have been `x##num`
(like `x14`), or even without the x.
(3) (arm64) The link register was being considered a part of the main
registers (`r30`), when in the real struct it has its own field. Given
that the link register is in the same spot in the struct as r[30] would be,
and that C++ doesn't care about anything, the calculation was still correct.
(4) (x86_64) The stack doesn't need to be aligned to 16 bytes when we
jump to the trampoline function, but it needs to be before performing
call instructions. Encoding that logic in cfi information was tricky, so
we decided to make the cfa information relative to `rbp` and align `rsp`.
Note that this could have been done using another register directly,
but it seems cleaner to make a new fake stack frame.
There are some other minor changes like adding a `brk 1` instruction in
arm64 to make sure that we never return to the crash trampoline (similar to
what we do in x86_64).
Sadly this commit does not fix unwinding for all use cases for arm64.
Crashing functions that do not add information related to the return column in
their cfi information will fail to unwind due to a bug in libunwinder.
Reviewers: mcgrathr, jakehehrlich, phosek, kcc, aarongreen
Subscribers: aprantl, kristof.beyls, #sanitizers, llvm-commits
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D69579
We may end up with a case where we have a widenable branch above the loop, but not all widenable branches within the loop have been removed. Since a widenable branch inhibit SCEVs ability to reason about exit counts (by design), we have a tradeoff between effectiveness of this optimization and allowing future widening of the branches within the loop. LoopPred is thought to be one of the most important optimizations for range check elimination, so let's pay the cost.