Prevent the following pathological behavior:
Since memory access handling is not synchronized with DoReset,
a thread running concurrently with DoReset can leave a bogus shadow value
that will be later falsely detected as a race. For such false races
RestoreStack will return false and we will not report it.
However, consider that a thread leaves a whole lot of such bogus values
and these values are later read by a whole lot of threads.
This will cause massive amounts of ReportRace calls and lots of
serialization. In very pathological cases the resulting slowdown
can be >100x. This is very unlikely, but it was presumably observed
in practice: https://github.com/google/sanitizers/issues/1552
If this happens, previous access sid+epoch will be the same for all of
these false races b/c if the thread will try to increment epoch, it will
notice that DoReset has happened and will stop producing bogus shadow
values. So, last_spurious_race is used to remember the last sid+epoch
for which RestoreStack returned false. Then it is used to filter out
races with the same sid+epoch very early and quickly.
It is of course possible that multiple threads left multiple bogus shadow
values and all of them are read by lots of threads at the same time.
In such case last_spurious_race will only be able to deduplicate a few
races from one thread, then few from another and so on. An alternative
would be to hold an array of such sid+epoch, but we consider such scenario
as even less likely.
Note: this can lead to some rare false negatives as well:
1. When a legit access with the same sid+epoch participates in a race
as the "previous" memory access, it will be wrongly filtered out.
2. When RestoreStack returns false for a legit memory access because it
was already evicted from the thread trace, we will still remember it in
last_spurious_race. Then if there is another racing memory access from
the same thread that happened in the same epoch, but was stored in the
next thread trace part (which is still preserved in the thread trace),
we will also wrongly filter it out while RestoreStack would actually
succeed for that second memory access.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D130269
We used to deduplicate based on the race address to prevent lots
of repeated reports about the same race.
But now we clear the shadow for the racy address in DoReportRace:
// This prevents trapping on this address in future.
for (uptr i = 0; i < kShadowCnt; i++)
StoreShadow(&shadow_mem[i], i == 0 ? Shadow::kRodata : Shadow::kEmpty);
It should have the same effect of not reporting duplicates
(and actually better because it's automatically reset when the memory is reallocated).
So drop the address deduplication code. Both simpler and faster.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D130240
ClearShadowMemoryForContextStack assumes that context contains the stack
bounds. This is not true for a context from getcontext or oucp of
swapcontext.
Reviewed By: kstoimenov
Differential Revision: https://reviews.llvm.org/D130218
This is a NFC change to factor out GCD worker thread registration via
the pthread introspection hook.
In a follow-up change we also want to register GCD workers for ASan to
make sure threads are registered before we attempt to print reports on
them.
rdar://93276353
Differential Revision: https://reviews.llvm.org/D126351
This patch, on top of D120048 <https://reviews.llvm.org/D120048>, supports
GetTls on Solaris 11.3 and Illumos that lack `dlpi_tls_modid`. It's the
same method originally used in D91605 <https://reviews.llvm.org/D91605>,
but integrated into `GetStaticTlsBoundary`.
Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11`, and
`x86_64-pc-linux-gnu`.
Differential Revision: https://reviews.llvm.org/D120059
sanitizer_platform_limits_posix.h defines `__sanitizer_XDR ` if `SANITIZER_LINUX && !SANITIZER_ANDROID`, but sanitizer_platform_limits_posix.cpp tries to check it if `HAVE_RPC_XDR_H`. This coincidentally works because macOS has a broken <rpc/xdr.h> which causes `HAVE_RPC_XDR_H` to be 0, but if <rpc/xdr.h> is fixed then clang fails to compile on macOS. Restore the platform checks so that <rpc/xdr.h> can be fixed on macOS.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D130060
Depends on D129371.
It survived all GCC ASan tests.
Changes are trivial and mostly "borrowed" RISC-V logics, except that a different SHADOW_OFFSET is used.
Reviewed By: SixWeining, MaskRay, XiaodongLoong
Differential Revision: https://reviews.llvm.org/D129418
Initial libsanitizer support for LoongArch. It survived all GCC UBSan tests.
Major changes:
1. LoongArch port of Linux kernel only supports `statx` for `stat` and its families. So we need to add `statx_to_stat` and use it for `stat`-like libcalls. The logic is "borrowed" from Glibc.
2. `sanitizer_syscall_linux_loongarch64.inc` is mostly duplicated from RISC-V port, as the syscall interface is almost same.
Reviewed By: SixWeining, MaskRay, XiaodongLoong, vitalybuka
Differential Revision: https://reviews.llvm.org/D129371
This patches exposed existing incorectness of swapcontext imlementation.
swapcontext does not set oucp->uc_stack. Unpoisoning works if ucp is
from makecontext, but may try to use garbage pointers if it's from
previos swapcontext or from getcontext. Existing limit reduces
probability of garbage pointers are used.
I restore behavour which we had for years, and will look to improve
swapcontext support.
This reverts commit d0751c9725.
Use the FreeBSD AArch64 memory layout values when building for it.
These are based on the x86_64 values, scaled to take into account the
larger address space on AArch64.
Reviewed by: vitalybuka
Differential Revision: https://reviews.llvm.org/D125758
If lots of threads do lots of malloc/free and they overflow
per-pthread DenseSlabAlloc cache, it causes lots of contention:
31.97% race.old race.old [.] __sanitizer::StaticSpinMutex::LockSlow
17.61% race.old race.old [.] __tsan_read4
10.77% race.old race.old [.] __tsan::SlotLock
Optimize DenseSlabAlloc to use a lock-free stack of batches of nodes.
This way we don't take any locks in steady state at all and do only
1 push/pop per Refill/Drain.
Effect on the added benchmark:
$ TIME="%e %U %S %M" time ./test.old 36 5 2000000
34.51 978.22 175.67 5833592
32.53 891.73 167.03 5790036
36.17 1005.54 201.24 5802828
36.94 1004.76 226.58 5803188
$ TIME="%e %U %S %M" time ./test.new 36 5 2000000
26.44 720.99 13.45 5750704
25.92 721.98 13.58 5767764
26.33 725.15 13.41 5777936
25.93 713.49 13.41 5791796
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D130002
Because the call to `dlerror()` may actually want to print something, which turns into a deadlock
as showcased in #49223.
Instead rely on further call to dlsym to clear `dlerror` internal state if they
need to check the return status.
Differential Revision: https://reviews.llvm.org/D128992
On a mips64el-linux-gnu system, the dynamic linker arranges TLS blocks
like:
[0] 0xfff7fe9680..0xfff7fe9684, align = 0x4
[1] 0xfff7fe9688..0xfff7fe96a8, align = 0x8
[2] 0xfff7fe96c0..0xfff7fe9e60, align = 0x40
[3] 0xfff7fe9e60..0xfff7fe9ef8, align = 0x8
Note that the dynamic linker can only put [1] at 0xfff7fe9688, not
0xfff7fe9684 or it will be misaligned. But we were comparing the
distance between two blocks with the alignment of the previous range,
causing GetStaticTlsBoundary fail to merge the consecutive blocks.
Compare against the alignment of the latter range to fix the issue.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D129112
Since the introduction of GoogleTest sharding in D122251
<https://reviews.llvm.org/D122251>, some of the Solaris sanitizer tests
have been running extremly long (up to an hour) while they took mere
seconds before. Initial investigation suggests that massive lock
contention in Solaris procfs is involved here.
However, there's an easy way to somewhat reduce the impact: while the
current `ReadProcMaps` uses `ReadFileToBuffer` to read `/proc/self/xmap`,
that function primarily caters to Linux procfs reporting file sizes of 0
while the size on Solaris is accurate. This patch makes use of that,
reducing the number of syscalls involved and reducing the runtime of
affected tests by a factor of 4.
Besides, it handles shared mappings and doesn't call `readlink` for unnamed
map entries.
Tested on `sparcv9-sun-solaris2.11` and `amd64-pc-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D129837
The reserve constructor was removed in 44f55509d7
but this one was missed. As a result, we attempt to iterate through 1024 threads
each time, most of which are 0.
Differential Revision: https://reviews.llvm.org/D129897
We already link libunwind explicitly so avoid trying to link toolchain's
default libunwind which may be missing. This matches what we already do
for libcxx and libcxxabi.
Differential Revision: https://reviews.llvm.org/D129472
Add weak definitions for the load/store callbacks.
This matches the weak definitions for all other SanitizerCoverage
callbacks.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D129801
Callers of TraceSwitchPart expect that TraceAcquire will always succeed
after the call. It's possible that TryTraceFunc/TraceMutexLock in TraceSwitchPart
that restore the current stack/mutexset filled the trace part exactly up
to the TracePart::kAlignment gap and the next TraceAcquire won't succeed.
Skip the alignment gap after writing initial stack/mutexset to avoid that.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D129777
New version of Intel LAM patches
(https://lore.kernel.org/linux-mm/20220712231328.5294-1-kirill.shutemov@linux.intel.com/)
uses a different interface based on arch_prctl():
- arch_prctl(ARCH_GET_UNTAG_MASK, &mask) returns the current mask for
untagging the pointers. We use it to detect kernel LAM support.
- arch_prctl(ARCH_ENABLE_TAGGED_ADDR, nr_bits) enables pointer tagging
for the current process.
Because __NR_arch_prctl is defined in different headers, and no other
platforms need it at the moment, we only declare internal_arch_prctl()
on x86_64.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D129645
Hwasan includes instructions in the prologue that mix the PC and SP and store
it into the stack ring buffer stored at __hwasan_tls. This is a thread_local
global exposed from the hwasan runtime. However, if TLS-mechanisms or the
hwasan runtime haven't been setup yet, it will be invalid to access __hwasan_tls.
This is the case for Fuchsia where we instrument libc, so some functions that
are instrumented but can run before hwasan initialization will incorrectly
access this global. Additionally, libc cannot have any TLS variables, so we
cannot weakly define __hwasan_tls until the runtime is loaded.
A way we can work around this is by moving the instructions into a hwasan
function that does the store into the ring buffer and creating a weak definition
of that function locally in libc. This way __hwasan_tls will not actually be
referenced. This is not our long-term solution, but this will allow us to roll
out hwasan in the meantime.
This patch includes:
- A new llvm flag for choosing to emit a libcall rather than instructions in the
prologue (off by default)
- The libcall for storing into the ringbuffer (__hwasan_add_frame_record)
Differential Revision: https://reviews.llvm.org/D128387
Hwasan includes instructions in the prologue that mix the PC and SP and store
it into the stack ring buffer stored at __hwasan_tls. This is a thread_local
global exposed from the hwasan runtime. However, if TLS-mechanisms or the
hwasan runtime haven't been setup yet, it will be invalid to access __hwasan_tls.
This is the case for Fuchsia where we instrument libc, so some functions that
are instrumented but can run before hwasan initialization will incorrectly
access this global. Additionally, libc cannot have any TLS variables, so we
cannot weakly define __hwasan_tls until the runtime is loaded.
A way we can work around this is by moving the instructions into a hwasan
function that does the store into the ring buffer and creating a weak definition
of that function locally in libc. This way __hwasan_tls will not actually be
referenced. This is not our long-term solution, but this will allow us to roll
out hwasan in the meantime.
This patch includes:
- A new llvm flag for choosing to emit a libcall rather than instructions in the
prologue (off by default)
- The libcall for storing into the ringbuffer (__hwasan_record_frame_record)
Differential Revision: https://reviews.llvm.org/D128387
This is to finish the change started by D125816 , D126263 and D126577 (replace SANITIZER_MAC by SANITIZER_APPLE).
Dropping definition of SANITIZER_MAC completely, to remove any possible confusion.
Differential Revision: https://reviews.llvm.org/D129502
It is generally not a good idea to mix usage of glibc headers and Linux UAPI
headers (https://sourceware.org/glibc/wiki/Synchronizing_Headers). In glibc
since 7eae6a91e9b1670330c9f15730082c91c0b1d570 (milestone: 2.36), sys/mount.h
defines `fsconfig_command` which conflicts with linux/mount.h:
.../usr/include/linux/mount.h:95:6: error: redeclaration of ‘enum fsconfig_command’
Remove #include <linux/fs.h> which pulls in linux/mount.h. Expand its 4 macros manually.
Android sys/mount.h doesn't define BLKBSZGET and it still needs linux/fs.h.
In the long term we should move Linux specific definitions to sanitizer_platform_limits_linux.cpp
but this commit is easy to cherry pick into older compiler-rt releases.
Fix https://github.com/llvm/llvm-project/issues/56421
Reviewed By: #sanitizers, vitalybuka, zatrazz
Differential Revision: https://reviews.llvm.org/D129471
It is generally not a good idea to mix usage of glibc headers and Linux UAPI
headers (https://sourceware.org/glibc/wiki/Synchronizing_Headers). In glibc
since 7eae6a91e9b1670330c9f15730082c91c0b1d570 (milestone: 2.36), sys/mount.h
defines `fsconfig_command` which conflicts with linux/mount.h:
.../usr/include/linux/mount.h:95:6: error: redeclaration of ‘enum fsconfig_command’
Remove #include <linux/fs.h> which pulls in linux/mount.h. Expand its 4 macros manually.
Fix https://github.com/llvm/llvm-project/issues/56421
Reviewed By: #sanitizers, vitalybuka, zatrazz
Differential Revision: https://reviews.llvm.org/D129471
After https://reviews.llvm.org/D129237, the assumption
that any non-null data contains a valid vmar handle is no
longer true. Generally this code here needs cleanup, but
in the meantime this fixes errors on Fuchsia.
Differential Revision: https://reviews.llvm.org/D129331
This is a follow up to D118200 which applies a similar cleanup to
headers when using in-tree libc++ to avoid accidentally picking up
the system headers.
Differential Revision: https://reviews.llvm.org/D128035
This is a follow up to D118200 which applies a similar cleanup to
headers when using in-tree libc++ to avoid accidentally picking up
the system headers.
Differential Revision: https://reviews.llvm.org/D128035
We currently have an option to select C++ ABI and C++ library for tests
but there are runtimes that use C++ library, specifically ORC and XRay,
which aren't covered by existing options. This change introduces a new
option to control the use of C++ libray for these runtimes.
Ideally, this option should become the default way to select C++ library
for all of compiler-rt replacing the existing options (the C++ ABI
option could remain as a hidden internal option).
Differential Revision: https://reviews.llvm.org/D128036
While investigating another issue, I noticed that `MaybeReexec()` never
actually "re-executes via `execv()`" anymore. `DyldNeedsEnvVariable()`
only returned true on macOS 10.10 and below.
Usually, I try to avoid "unnecessary" cleanups (it's hard to be certain
that there truly is no fallout), but I decided to do this one because:
* I initially tricked myself into thinking that `MaybeReexec()` was
relevant to my original investigation (instead of being dead code).
* The deleted code itself is quite complicated.
* Over time a few other things were mushed into `MaybeReexec()`:
initializing `MonotonicNanoTime()`, verifying interceptors are
working, and stripping the `DYLD_INSERT_LIBRARIES` env var to avoid
problems when forking.
* This platform-specific thing leaked into `sanitizer_common.h`.
* The `ReexecDisabled()` config nob relies on the "strong overrides weak
pattern", which is now problematic and can be completely removed.
* `ReexecDisabled()` actually hid another issue with interceptors not
working in unit tests. I added an explicit `verify_interceptors`
(defaults to `true`) option instead.
Differential Revision: https://reviews.llvm.org/D129157