Commit Graph

11 Commits

Author SHA1 Message Date
Dmitry Vyukov 92a3a2dc3e sanitizer_common: introduce kInvalidTid/kMainTid
Currently we have a bit of a mess related to tids:
 - sanitizers re-declare kInvalidTid multiple times
 - some call it kUnknownTid
 - implicit assumptions that main tid is 0
 - asan/memprof claim their tids need to fit into 24 bits,
   but this does not seem to be true anymore
 - inconsistent use of u32/int to store tids

Introduce kInvalidTid/kMainTid in sanitizer_common
and use them consistently.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D101428
2021-04-30 15:58:05 +02:00
Dmitry Vyukov ed7bf7d73f tsan: refactor fork handling
Commit efd254b636 ("tsan: fix deadlock in pthread_atfork callbacks")
fixed another deadlock related to atfork handling.
But builders with DCHECKs enabled reported failures of
pthread_atfork_deadlock2.c and pthread_atfork_deadlock3.c tests
related to the fact that we hold runtime locks on interceptor exit:
https://lab.llvm.org/buildbot/#/builders/70/builds/6727
This issue is somewhat inherent to the current approach,
we indeed execute user code (atfork callbacks) with runtime lock held.

Refactor fork handling to not run user code (atfork callbacks)
with runtime locks held. This change does this by installing
own atfork callbacks during runtime initialization.
Atfork callbacks run in LIFO order, so the expectation is that
our callbacks run last, right before the actual fork.
This way we lock runtime mutexes around fork, but not around
user callbacks.

Extend tests to also install after fork callbacks just to cover
more scenarios. Some tests also started reporting real races
that we previously suppressed.

Also extend tests to cover fork syscall support.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D101517
2021-04-30 08:48:20 +02:00
Tres Popp d1e08b124c Revert "tsan: refactor fork handling"
This reverts commit e1021dd1fd.
2021-04-28 14:08:33 +02:00
Dmitry Vyukov e1021dd1fd tsan: refactor fork handling
Commit efd254b636 ("tsan: fix deadlock in pthread_atfork callbacks")
fixed another deadlock related to atfork handling.
But builders with DCHECKs enabled reported failures of
pthread_atfork_deadlock2.c and pthread_atfork_deadlock3.c tests
related to the fact that we hold runtime locks on interceptor exit:
https://lab.llvm.org/buildbot/#/builders/70/builds/6727
This issue is somewhat inherent to the current approach,
we indeed execute user code (atfork callbacks) with runtime lock held.

Refactor fork handling to not run user code (atfork callbacks)
with runtime locks held. This change does this by installing
own atfork callbacks during runtime initialization.
Atfork callbacks run in LIFO order, so the expectation is that
our callbacks run last, right before the actual fork.
This way we lock runtime mutexes around fork, but not around
user callbacks.

Extend tests to also install after fork callbacks just to cover
more scenarios. Some tests also started reporting real races
that we previously suppressed.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D101385
2021-04-27 22:37:27 +02:00
Evgenii Stepanov 5275d772da Revert "tsan: fix deadlock in pthread_atfork callbacks"
Tests fail on debug builders. See the forward fix in
https://reviews.llvm.org/D101385.

This reverts commit efd254b636.
2021-04-27 12:36:31 -07:00
Dmitry Vyukov efd254b636 tsan: fix deadlock in pthread_atfork callbacks
We take report/thread_registry locks around fork.
This means we cannot report any bugs in atfork handlers.
We resolved this by enabling per-thread ignores around fork.
This resolved some of the cases, but not all.
The added test triggers a race report from a signal handler
called from atfork callback, we reset per-thread ignores
around signal handlers, so we tried to report it and deadlocked.
But there are more cases: a signal handler can be called
synchronously if it's sent to itself. Or any other report
types would cause deadlocks as well: mutex misuse,
signal handler spoiling errno, etc.
Disable all reports for the duration of fork with
thr->suppress_reports and don't re-enable them around
signal handlers.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D101154
2021-04-27 13:25:26 +02:00
Dmitry Vyukov 1624be938d tsan: fix leak of ThreadSignalContext memory mapping when destroying fibers
When creating and destroying fibers in tsan a thread state is created and destroyed. Currently, a memory mapping is leaked with each fiber (in __tsan_destroy_fiber). This causes applications with many short running fibers to crash or hang because of linux vm.max_map_count.

The root of this is that ThreadState holds a pointer to ThreadSignalContext for handling signals. The initialization and destruction of it is tied to platform specific events in tsan_interceptors_posix and missed when destroying a fiber (specifically, SigCtx is used to lazily create the ThreadSignalContext in tsan_interceptors_posix). This patch cleans up the memory by makinh the ThreadState create and destroy the ThreadSignalContext.

The relevant code causing the leak with fibers is the fiber destruction:

void FiberDestroy(ThreadState *thr, uptr pc, ThreadState *fiber) {
  FiberSwitchImpl(thr, fiber);
  ThreadFinish(fiber);
  FiberSwitchImpl(fiber, thr);
  internal_free(fiber);
}

Author: Florian
Reviewed-in: https://reviews.llvm.org/D76073
2020-04-11 10:30:31 +02:00
Jonas Devlieghere 6430707196 Revert "tsan: fix leak of ThreadSignalContext for fibers"
Temporarily revert "tsan: fix leak of ThreadSignalContext for fibers"
because it breaks the LLDB bot on GreenDragon.

This reverts commit 93f7743851.
This reverts commit d8a0f76de7.
2020-03-25 19:18:38 -07:00
Dmitry Vyukov d8a0f76de7 tsan: fix leak of ThreadSignalContext for fibers
When creating and destroying fibers in tsan a thread state
is created and destroyed. Currently, a memory mapping is
leaked with each fiber (in __tsan_destroy_fiber).
This causes applications with many short running fibers
to crash or hang because of linux vm.max_map_count.

The root of this is that ThreadState holds a pointer to
ThreadSignalContext for handling signals. The initialization
and destruction of it is tied to platform specific events
in tsan_interceptors_posix and missed when destroying a fiber
(specifically, SigCtx is used to lazily create the
ThreadSignalContext in tsan_interceptors_posix). This patch
cleans up the memory by inverting the control from the
platform specific code calling the generic ThreadFinish to
ThreadFinish calling a platform specific clean-up routine
after finishing a thread.

The relevant code causing the leak with fibers is the fiber destruction:

void FiberDestroy(ThreadState *thr, uptr pc, ThreadState *fiber) {
  FiberSwitchImpl(thr, fiber);
  ThreadFinish(fiber);
  FiberSwitchImpl(fiber, thr);
  internal_free(fiber);
}

I would appreciate feedback if this way of fixing the leak is ok.
Also, I think it would be worthwhile to more closely look at the
lifecycle of ThreadState (i.e. it uses no constructor/destructor,
thus requiring manual callbacks for cleanup) and how OS-Threads/user
level fibers are differentiated in the codebase. I would be happy to
contribute more if someone could point me at the right place to
discuss this issue.

Reviewed-in: https://reviews.llvm.org/D76073
Author: Florian (Florian)
2020-03-25 17:05:46 +01:00
Dmitry Vyukov 2dcbdba854 tsan: fix pthread_detach with called_from_lib suppressions
Generally we ignore interceptors coming from called_from_lib-suppressed libraries.
However, we must not ignore critical interceptors like e.g. pthread_create,
otherwise runtime will lost track of threads.
pthread_detach is one of these interceptors we should not ignore as it affects
thread states and behavior of pthread_join which we don't ignore as well.
Currently we can produce very obscure false positives. For more context see:
https://groups.google.com/forum/#!topic/thread-sanitizer/ecH2P0QUqPs
The added test captures this pattern.

While we are here rename ThreadTid to ThreadConsumeTid to make it clear that
it's not just a "getter", it resets user_id to 0. This lead to confusion recently.

Reviewed in https://reviews.llvm.org/D74828
2020-02-26 12:59:49 +01:00
Nico Weber 5a3bb1a4d6 compiler-rt: Rename .cc file in lib/tsan/rtl to .cpp
Like r367463, but for tsan/rtl.

llvm-svn: 367564
2019-08-01 14:22:42 +00:00