Inside `ExprEngine::VisitLambdaExpr()` we wasn't prepared for a
copy elided initialized capture's `InitExpr`. This patch teaches
the analyzer how to handle such situation.
Differential Revision: https://reviews.llvm.org/D131784
The per-PSB packet decoding logic was wrong because it was assuming that pt_insn_get_sync_offset was being udpated after every PSB. Silly me, that is not true. It returns the offset of the PSB packet after invoking pt_insn_sync_forward regardless of how many PSBs are visited later. Instead, I'm now following the approach described in https://github.com/intel/libipt/blob/master/doc/howto_libipt.md#parallel-decode for parallel decoding, which is basically what we need.
A nasty error that happened because of this is that when we had two PSBs (A and B), the following was happening
1. PSB A was processed all the way up to the end of the trace, which includes PSB B.
2. PSB B was then processed until the end of the trace.
The instructions emitted by step 2. were also emitted as part of step 1. so our trace had duplicated chunks. This problem becomes worse when you many PSBs.
As part of making sure this diff is correct, I added some other features that are very useful.
- Added a "synchronization point" event to the TraceCursor, so we can inspect when PSBs are emitted.
- Removed the single-thread decoder. Now the per-cpu decoder and single-thread decoder use the same code paths.
- Use the query decoder to fetch PSBs and timestamps. It turns out that the pt_insn_sync_forward of the instruction decoder can move past several PSBs (this means that we could skip some TSCs). On the other hand, the pt_query_sync_forward method doesn't skip PSBs, so we can get more accurate sync events and timing information.
- Turned LibiptDecoder into PSBBlockDecoder, which decodes single PSB blocks. It is the fundamental processing unit for decoding.
- Added many comments, asserts and improved error handling for clarity.
- Improved DecodeSystemWideTraceForThread so that a TSC is emitted always before a cpu change event. This was a bug that was annoying me before.
- SplitTraceInContinuousExecutions and FindLowestTSCInTrace are now using the query decoder, which can identify precisely each PSB along with their TSCs.
- Added an "only-events" option to the trace dumper to inspect only events.
I did extensive testing and I think we should have an in-house testing CI. The LLVM buildbots are not capable of supporting testing post-mortem traces of hundreds of megabytes. I'll leave that for later, but at least for now the current tests were able to catch most of the issues I encountered when doing this task.
A sample output of a program that I was single stepping is the following. You can see that only one PSB is emitted even though stepping happened!
```
thread #1: tid = 3578223
0: (event) trace synchronization point [offset = 0x0xef0]
a.out`main + 20 at main.cpp:29:20
1: 0x0000000000402479 leaq -0x1210(%rbp), %rax
2: (event) software disabled tracing
3: 0x0000000000402480 movq %rax, %rdi
4: (event) software disabled tracing
5: (event) software disabled tracing
6: 0x0000000000402483 callq 0x403bd4 ; std::vector<int, std::allocator<int>>::vector at stl_vector.h:391:7
7: (event) software disabled tracing
a.out`std::vector<int, std::allocator<int>>::vector() at stl_vector.h:391:7
8: 0x0000000000403bd4 pushq %rbp
9: (event) software disabled tracing
10: 0x0000000000403bd5 movq %rsp, %rbp
11: (event) software disabled tracing
```
This is another trace of a long program with a few PSBs.
```
(lldb) thread trace dump instructions -E -f thread #1: tid = 3603082
0: (event) trace synchronization point [offset = 0x0x80]
47417: (event) software disabled tracing
129231: (event) trace synchronization point [offset = 0x0x800]
146747: (event) software disabled tracing
246076: (event) software disabled tracing
259068: (event) trace synchronization point [offset = 0x0xf78]
259276: (event) software disabled tracing
259278: (event) software disabled tracing
no more data
```
Differential Revision: https://reviews.llvm.org/D131630
Summary:
The implementation of _Unwind_FindEnclosingFunction(void *ip) takes the context of itself and then uses the context to get the info of the function enclosing ip. This approach does not work for AIX because on AIX, the TOC base in GPR2 is used as the base for calculating relative addresses. Since _Unwind_FindEnclosingFunction() may be in a different shared lib than the function containing ip, their TOC bases can be different. Therefore, using the value of GPR2 in the context from _Unwind_FindEnclosingFunction() as the base results in incorrect addresses. On the other hand, the start address of a function is available in the traceback table following the instructions of each function on AIX. To get to the traceback table, search a word of 0 starting from ip and the traceback table is located after the word 0. This patch implements _Unwind_FindEnclosingFunction() for AIX by obtaining the function start address from its traceback table.
Reviewed by: compnerd, MaskRay, libunwind
Differential Revision: https://reviews.llvm.org/D131709
DWARF v5 6.2.4 The Line Number Program Header says:
> The first entry is the current directory of the compilation. Each additional
> path entry is either a full path name or is relative to the current directory of
> the compilation.
When forming a path, relative DW_AT_comp_dir and directories[0] are not supposed
to be joined together. Fix getFileNameByIndex to special case DWARF v5 DirIdx == 0.
Reviewed By: #debug-info, dblaikie
Differential Revision: https://reviews.llvm.org/D131804
The linker may convert such an ADD into a LEA, so we must not
use the EFLAGS output.
This causes miscompiles with -fsanitize=null after
bacdf80f42 added
llvm.threadlocal.address -- previously, global variables were known to
be non-null, but the intrinsic is not currently known to return
nonnull. (That should be corrected, but it shouldn't've caused
miscompiles!)
Differential Revision: https://reviews.llvm.org/D131716
`AllOfType` is a type constraint that satisfies all given type
constraints and `ConfinedType` is a type that satisfies additional
predicates. These shorthands simplify type constraint definition mostly
by removing the need to deal with `myType.predicate` manipulation.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D131788
For generated assembly debug info, MCDwarfLineTableHeader::CompilationDir is an
unmapped path set in MCContext::setGenDwarfRootFile. Remap it.
A relative destination path of -fdebug-prefix-map= exposes a llvm-dwarfdump bug
which joins relative DW_AT_comp_dir and directories[0].
Fix https://github.com/llvm/llvm-project/issues/56609
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D131749
This patch supports SPIR-V capabilities and extensions. In addition,
it inserts decorations related to MIFlags and improves support of switches.
Five tests are included to demonstrate the improvement.
Differential Revision: https://reviews.llvm.org/D131221
Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
The flag accidentally used Joined<> instead of Flag<>.
Previously, `--warn-dylib-install-namefoobarbaz` would be accepted and
had the same effect as `-warn-dylib-install-name`. Now the flag only
works if no suffix is attached to it, as originally intended.
Also fix a typo in the flag's help text.
Differential Revision: https://reviews.llvm.org/D131781
The SemanticsContext is needed to analyze expression later in the
lowering for directive languages. This patch allows to keep a reference of
the SemanticsContext in the LoweringBridge.
Building block for D131765
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D131764
Allows for even more savings in the binary image while simultaneously removing the name of the offending stack variable.
Depends on D131631
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D131728
As a follow-up of e8578968f6 which replaced the
callers to use the C++17 equivalents, remove the equivalents from
`STLForwardCompat.h` entirely and their corresponding tests.
Differential Revision: https://reviews.llvm.org/D131769
To accurately measure the size of sprintf in a finished binary, the
easiest method is to simply build a binary with and without sprintf.
This patch adds an integration test that can be built with and without
sprintf, as well as targets to build it.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D131735
RBE is currently broken due to the RBE container being too old and not supporting C++17.
The bots have already stopped using --config=rbe.
Differential Revision: https://reviews.llvm.org/D131722
Always emit the TagType for RecordInfo in YAML output. Previously this omitted the type for "struct", considering it the default. But records in C++ don't really have a default type so always emitting this is more clear.
Emit IsTypeDef in YAML. Previously this existed only in the Representation but was never written. Additionally, adds IsTypeDef to the record merge operation which was clearing it (all RecordInfo structures are merged with am empty RecordInfo during the reduce phase).
Reviewed By: paulkirth
Differential Revision: https://reviews.llvm.org/D131739
When using `clang-doc --format=html` it will crash on startup because of an assertion doing a self-assignment of a `SmallString`. This patch removes the self-assignment by creating an intermediate copy.
Reviewed By: paulkirth
Differential Revision: https://reviews.llvm.org/D131793
The end-to-end test for this new feature also exposed a bug
in LLVM IR lowering (since then, fixed), where we need to account
for the min-poison bit as extra argument.
declare i32 @llvm.abs.i32(i32 <src>, i1 <is_int_min_poison>)
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D131712
We were dereferencing an empty Optional if IgnoreErrors was true and the
stat failed.
rdar://60887887
Differential Revision: https://reviews.llvm.org/D131791
The value of the attribute is a size in bytes. It has the effect of
suppressing inlining of functions whose stacksizes exceed the given value.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D129904
IncludeCleaner.RecursiveInclusion and IncludeCleaner.IWYUPragmaExport tests don't check referenced files list, so we don't need to call findReferencedFiles() there.
Reviewed By: kbobyrev
Differential Revision: https://reviews.llvm.org/D131706
This was done by adding --abort-on-invalid-reduction to remove-function-bodies-used-in-globals.ll and fixing the fallout.
Aliases must have a GlobalValue or ConstantExpr aliasee and the aliasee must be a definition if it's a GlobalValue.
Don't RAUW functions with null if there's an alias pointing to it, and similarly don't delete the body of a function.
Don't delete the entire body of a function when reducing blocks, preserve at least one block.
Also make debugging these sorts of things easier by dumping the module when --abort-on-invalid-reduction triggers.
Reviewed By: regehr
Differential Revision: https://reviews.llvm.org/D131505
When targeting macOS Ventura, ld64 will use authenticated fixups for
x86_64 as well as arm64 (where that has always been the case). This
results in test failures when using an Xcode 14 toolchain on an Intel
mac running macOS Ventura:
Failed Tests (3):
lldb-api :: commands/target/basic/TestTargetCommand.py
lldb-api :: lang/c/global_variables/TestGlobalVariables.py
lldb-api :: lang/cpp/char8_t/TestCxxChar8_t.py
Rather than trying to come up with a sophisticated decorator based off
the deployment target, I marked them all as skipped with a comment
explaining why.
Differential revision: https://reviews.llvm.org/D131741
We're seeing non-determinism with loading sample profiles. It seems to
be related to the order in which we merge FunctionSamples in
promoteMergeNotInlinedContextSamples(). Use a MapVector to iterate over
NonInlinedCallSites in the order entries were inserted.
Reviewed By: wenlei, davidxl
Differential Revision: https://reviews.llvm.org/D131592
The code was relicensed by its owner (Unicode.org) a long time back,
but we still had the old (problematic) license in our fork.
Note that the source files have not been distributed from unicode.org
since 2009 (due to being buggy and unmaintained upstream), but they
were given this license before that.
Fixes https://github.com/llvm/llvm-project/issues/32309
Differential Revision: https://reviews.llvm.org/D66390
This patch restructures `DataflowAnalysisOptions` and `TransferOptions` to use `llvm::Optional`, in preparation for adding more sub-options to the `ContextSensitiveOptions` struct introduced here.
Reviewed By: sgatev, xazax.hun
Differential Revision: https://reviews.llvm.org/D131779
The LLVM intrinsic has a bool flag `is_int_min_poison` that needs to be
set.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D131785
Before this patch type traits are checked in Parser, so use type traits
directly did not cause assertion faults. However if type traits are initialized
from a template, we didn't perform arity checks before evaluating. This
patch moves arity checks from Parser to Sema, and performing arity
checks in Sema actions, so type traits get checked corretly.
Crash input:
```
template<class... Ts> bool b = __is_constructible(Ts...);
bool x = b<>;
```
After this patch:
```
clang/test/SemaCXX/type-trait-eval-crash-issue-57008.cpp:5:32: error: type trait requires 1 or more arguments; have 0 arguments
template<class... Ts> bool b = __is_constructible(Ts...);
^~~~~~~~~~~~~~~~~~
clang/test/SemaCXX/type-trait-eval-crash-issue-57008.cpp:6:10: note: in instantiation of variable template specialization 'b<>' requested here
bool x = b<>;
^
1 error generated.
```
See https://godbolt.org/z/q39W78hsK.
Fixes https://github.com/llvm/llvm-project/issues/57008
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D131423
The goal is to reduce the size of the MSAN with track origins binary, by making
the variable name locations constant which will allow the linker to compress
them.
Follows: https://reviews.llvm.org/D131415
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D131631
Introduce two different failure propagation mode in the Transform
dialect's Sequence operation. These modes specify whether silenceable
errors produced by nested ops are immediately propagated, thus stopping
the sequence, or suppressed. The latter is useful in end-to-end
transform application scenarios where the user cannot correct the
transformation, but it is robust enough to silenceable failures. It
can be combined with the "alternatives" operation. There is
intentionally no default value to avoid favoring one mode over the
other.
Downstreams can update their tests using:
S='s/sequence \(%.*\) {/sequence \1 failures(propagate) {/'
T='s/sequence {/sequence failures(propagate) {/'
git grep -l transform.sequence | xargs sed -i -e "$S"
git grep -l transform.sequence | xargs sed -i -e "$T"
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D131774
This refactors LTO compile to look more like COFF, where cache hits and misses are all funneled through the same code path.
Previously, cache hits were *not* being saved to -object_path_lto, which led to them sometimes falling out of the cache before dsymutil could process them. As a side effect of the refactor, cached objects are now saved with -save-temps as well, which seems desirable.
(Deleted lld/test/MachO/lto-cache-dsymutil.ll and rolled it into lld/test/MachO/lto-object-path.ll, since the cache-only, non object path approach is unreliable anyway).
Differential Revision: https://reviews.llvm.org/D131624
This fixes compilation errors in the following code with
latest libc++ and libstdc++:
```cpp
int main() {
std::tuple<std::string> s;
std::tuple<std::string_view> sv;
s < sv;
}
```
See https://gcc.godbolt.org/z/vGvEhxWvh for the error.
This used to happen because repeated template instantiations during
C++20 constraint satisfaction will cause a call to `RebuildCXXRewrittenBinaryOperator`
with `PerformADL` set to false, see 59351fe340/clang/lib/Sema/TreeTransform.h (L2737)
Committing urgently to unbreak breakages we see in our configurations.
The change seems relatively safe and the right thing to do.
However, I will follow-up with tests and investigation on whether we
need to do this extra semantic checking in the first place next week.
Implement tanf function correctly rounded for all rounding modes.
We use the range reduction that is shared with `sinf`, `cosf`, and `sincosf`:
```
k = round(x * 32/pi) and y = x * (32/pi) - k.
```
Then we use the tangent of sum formula:
```
tan(x) = tan((k + y)* pi/32) = tan((k mod 32) * pi / 32 + y * pi/32)
= (tan((k mod 32) * pi/32) + tan(y * pi/32)) / (1 - tan((k mod 32) * pi/32) * tan(y * pi/32))
```
We need to make a further reduction when `k mod 32 >= 16` due to the pole at `pi/2` of `tan(x)` function:
```
if (k mod 32 >= 16): k = k - 31, y = y - 1.0
```
And to compute the final result, we store `tan(k * pi/32)` for `k = -15..15` in a table of 32 double values,
and evaluate `tan(y * pi/32)` with a degree-11 minimax odd polynomial generated by Sollya with:
```
> P = fpminimax(tan(y * pi/32)/y, [|0, 2, 4, 6, 8, 10|], [|D...|], [0, 1.5]);
```
Performance benchmark using `perf` tool from the CORE-MATH project on Ryzen 1700:
```
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanf
CORE-MATH reciprocal throughput : 18.586
System LIBC reciprocal throughput : 50.068
LIBC reciprocal throughput : 33.823
LIBC reciprocal throughput : 25.161 (with `-msse4.2` flag)
LIBC reciprocal throughput : 19.157 (with `-mfma` flag)
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanf --latency
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH latency : 55.630
System LIBC latency : 106.264
LIBC latency : 96.060
LIBC latency : 90.727 (with `-msse4.2` flag)
LIBC latency : 82.361 (with `-mfma` flag)
```
Reviewed By: orex
Differential Revision: https://reviews.llvm.org/D131715
STLForwardCompat.h defines several utilities and type traits to mimic that of
the ones in the C++17 standard library. Now that LLVM is built with the C++17
standards mode, remove use of these equivalents in favor of the ones from the
standard library.
Differential Revision: https://reviews.llvm.org/D131717