14321 Commits

Author SHA1 Message Date
Lang Hames 6d8c63946c Revert "[ORC][ORC-RT] Add initial native-TLV support to MachOPlatform."
Reverts commit fe1fa43f16 while I investigate
failures on Linux.
2021-07-21 09:22:55 +10:00
Lang Hames fe1fa43f16 [ORC][ORC-RT] Add initial native-TLV support to MachOPlatform.
Adds code to LLVM (MachOPlatform) and the ORC runtime to support native MachO
thread local variables. Adding new TLVs to a JITDylib at runtime is supported.

On the LLVM side MachOPlatform is updated to:

1. Identify thread local variables in the LinkGraph and lower them to GOT
accesses to data in the __thread_data or __thread_bss sections.

2. Merge and report the address range of __thread_data and __thread_bss sections
to the runtime.

On the ORC runtime side, a MachOTLVManager class is introduced which records the address
range of thread data/bss sections, and creates thread-local instances from the
initial data on demand. An orc-runtime specific tlv_get_addr implementation is
included which saves all register state then calls the MachOTLVManager to get
the address of the requested variable for the current thread.
2021-07-21 09:10:10 +10:00
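The "thread-local instances from the initial data on demand" idea in fe1fa43f16 can be sketched roughly as follows. This is only an illustration under stated assumptions: `SketchTLVManager` and its methods are hypothetical names, and the sketch leans on C++ `thread_local` storage rather than whatever mechanism the ORC runtime actually uses.

```cpp
#include <cassert>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a TLV manager: it records the initial bytes of a
// thread-data range and hands each thread its own copy on first use.
class SketchTLVManager {
 public:
  explicit SketchTLVManager(std::vector<char> initial_data)
      : initial_data_(std::move(initial_data)) {}

  // Returns this thread's instance, creating it from the initial data on
  // demand. A thread_local map keyed by manager keeps instances separate.
  char *GetInstanceForCurrentThread() {
    thread_local std::unordered_map<const SketchTLVManager *, std::vector<char>>
        instances;
    auto it = instances.find(this);
    if (it == instances.end())
      it = instances.emplace(this, initial_data_).first;
    return it->second.data();
  }

 private:
  const std::vector<char> initial_data_;
};

// Demo: a write on one thread is not visible from another thread.
inline void SketchTLVDemo() {
  SketchTLVManager manager(std::vector<char>{'a', 'b'});
  manager.GetInstanceForCurrentThread()[0] = 'z';
  char seen_by_other_thread = 0;
  std::thread t([&] {
    seen_by_other_thread = manager.GetInstanceForCurrentThread()[0];
  });
  t.join();
  assert(seen_by_other_thread == 'a');
  assert(manager.GetInstanceForCurrentThread()[0] == 'z');
}
```

Each thread pays for its copy only when it first asks for the variable, which is the "on demand" property the commit message describes.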
Florian Mayer 98687aa0d6 [NFC] run clang-format on hwasan use-after-scope tests.
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D106259
2021-07-20 10:26:26 +01:00
Florian Mayer f3f287f0f6 [hwasan] [NFC] copy and disable ASAN tests to hwasan.
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D106159
2021-07-20 10:12:14 +01:00
Dmitry Vyukov 3f981fc186 sanitizer_common: add new mutex
We currently have 3 different mutexes:
 - RWMutex
 - BlockingMutex
 - __tsan::Mutex

RWMutex and __tsan::Mutex are roughly the same,
except that the tsan version supports deadlock detection.
BlockingMutex degrades better under heavy contention
from lots of threads (it blocks in the OS), but it is much slower
under light contention, has non-portable performance,
has a larger static size, and is not reader-writer.

Add a new mutex that combines all advantages of these
mutexes: it's reader-writer, has a fast uncontended path,
supports blocking to gracefully degrade under higher contention,
and has portable size/performance.

For now it's named Mutex2 for incremental submission. The plan is to:
 - land this change
 - then move deadlock detection logic from tsan
 - then rename it to Mutex and remove tsan Mutex
 - then typedef RWMutex/BlockingMutex to this mutex

SpinMutex stays as a separate type because it has a faster fast path:
1 atomic RMW per lock/unlock as compared to 2 for this mutex.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D106231
2021-07-20 08:19:57 +02:00
Petr Hosek ff427909ca [NFC][profile] Move writeMMappedFile to ELF ifdef block
This avoids the compiler warning on Darwin where that function is unused.
2021-07-19 23:13:13 -07:00
Dmitry Vyukov adb55d7c32 tsan: remove the stats subsystem
I don't think the stats subsystem was ever used since tsan
development in 2012. But it adds lots of code, and this
effectively dead code needs to be updated if the runtime
code changes, which adds maintenance cost for no benefit.
A normal profiler usually gives enough info and that info
is more trustworthy.
Remove the stats subsystem.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D106276
2021-07-20 07:47:38 +02:00
Dmitry Vyukov d9b6e32dd7 tsan: add pragma line to buildgo.sh
Add pragma line so that error messages point to the actual
source files rather than to the concatenated gotsan.cpp.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D106275
2021-07-20 07:22:01 +02:00
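The effect described in d9b6e32dd7 can be illustrated with the standard `#line` directive. This is a minimal sketch: the file name and function names are hypothetical, and this log does not show the exact directive that buildgo.sh emits.

```cpp
#include <cassert>
#include <string>

// Pretend everything below came from line 100 of another file, the way a
// concatenated source can point back at its original pieces.
#line 100 "original_piece.cpp"
inline int SketchReportedLine() { return __LINE__; }
inline std::string SketchReportedFile() { return __FILE__; }

inline void SketchLineDemo() {
  // Diagnostics for the code above would name original_piece.cpp:100.
  assert(SketchReportedLine() == 100);
  assert(SketchReportedFile() == "original_piece.cpp");
}
```

A compiler error inside `SketchReportedLine` would be reported against `original_piece.cpp` rather than against the file that physically contains the code.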
Lang Hames 8afa4c40cb [ORC-RT] Introduce a ORC_RT_JIT_DISPATCH_TAG macro.
This macro can be used to define tag variables for use with jit-dispatch.
2021-07-20 11:30:54 +10:00
Lang Hames ebec95590c [ORC-RT] Add ORC_RT prefix to WEAK_IMPORT macro. 2021-07-20 11:30:54 +10:00
Petr Hosek 54902e00d1 [InstrProfiling] Use weak alias for bias variable
We need the compiler generated variable to override the weak symbol of
the same name inside the profile runtime, but using LinkOnceODRLinkage
results in a weak symbol being emitted, in which case the symbol selected
by the linker depends on the order of inputs, which can be
fragile.

This change replaces the use of a weak definition inside the runtime with
a weak alias. We place the compiler generated symbol inside a COMDAT
group so that a dead definition can be garbage collected by the linker.

We also disable the use of runtime counter relocation on Darwin since
Mach-O doesn't support weak external references, but Darwin already uses
a different continuous mode that relies on overmapping so runtime counter
relocation isn't needed there.

Differential Revision: https://reviews.llvm.org/D105176
2021-07-19 12:23:51 -07:00
David Carlier 2d56e1394b [Sanitizer] Intercepts flopen/flopenat on FreeBSD.
Reviewers: vitalybuka

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D106218
2021-07-19 19:46:35 +01:00
Dmitry Vyukov 7f67263d56 tsan: remove duplicate arch switch in buildgo.sh
For some reason we have 2 switches on arch and add
half of arch flags in one place and half in another.
Merge these 2 switches.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D106274
2021-07-19 17:19:17 +02:00
Alexander Belyaev f58a1f65e7 [rt][nfc] Rewrite #ifndef as #if defined(). 2021-07-19 14:17:13 +02:00
Lang Hames aa69f0d8fb [ORC-RT] Introduce a weak-import macro.
This should eliminate warnings about ignored weak_import attributes on some of
the bots, e.g. https://lab.llvm.org/buildbot/#/builders/165/builds/3770/.
2021-07-19 22:10:51 +10:00
Lang Hames 11c11006d7 [ORC-RT] Separate jit-dispach tag decls from definitions.
This should eliminate the "initialized and declared 'extern'" warnings produced
on some bots, e.g. https://lab.llvm.org/buildbot/#/builders/165/builds/3770
2021-07-19 22:10:51 +10:00
Dmitry Vyukov baa7f58973 tsan: make obtaining current PC faster
We obtain the current PC in all interceptors, and collectively
the common interceptor code contributes to the overall slowdown
(in particular for the cheaper str/mem* functions).

The current way to obtain the current PC involves:

  4493e1:       e8 3a f3 fe ff          callq  438720 <_ZN11__sanitizer10StackTrace12GetCurrentPcEv>
  4493e9:       48 89 c6                mov    %rax,%rsi

and the called function is:

uptr StackTrace::GetCurrentPc() {
  438720:       48 8b 04 24             mov    (%rsp),%rax
  438724:       c3                      retq

The new way uses address of a local label and involves just:

  44a888:       48 8d 35 fa ff ff ff    lea    -0x6(%rip),%rsi

I am not switching all uses of StackTrace::GetCurrentPc to GET_CURRENT_PC
because it may lead to some differences in produced reports and break tests.
The difference comes from the fact that currently we have the PC pointing
to the CALL instruction, but the new way does not yield any code on its own,
so the PC points to a random instruction in the function, and symbolizing
that instruction can produce additional inlined frames (if the random
instruction happens to relate to some inlined function).

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D106046
2021-07-19 13:05:30 +02:00
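The "address of a local label" technique from baa7f58973 can be sketched with the GNU labels-as-values extension (`&&label`), which GCC and Clang accept. `SketchCurrentPc` is a hypothetical helper, not the GET_CURRENT_PC macro itself, and the optimizer is free to move or merge the labeled code, so the value is only an approximate location inside the function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the sanitizer macro. Takes the address of a local
// label instead of calling out to a function that reads the return address.
inline std::uintptr_t SketchCurrentPc() {
  std::uintptr_t pc;
here:
  pc = reinterpret_cast<std::uintptr_t>(&&here);
  return pc;
}

inline void SketchPcDemo() {
  // The label address is a code address, so it is never null.
  assert(SketchCurrentPc() != 0);
}
```

Because no call instruction is involved, the address does not correspond to a call site, which is the source of the reporting differences the commit message mentions.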
Lang Hames 94e0975450 [ORC] Drop 'const' for __orc_rt_CWrapperFunctionResultDataUnion::ValuePtr.
This member is now only used when storage is heap-allocated so it does not
need to be const. Dropping 'const' eliminates cast warnings on many builders.
2021-07-19 21:03:12 +10:00
Lang Hames ad4f04773c [ORC-RT] Fix missing std::move.
This should fix the 'could-not-convert' error at wrapper_function_utils.h:128 in
https://lab.llvm.org/buildbot/#/builders/112/builds/7748.
2021-07-19 21:03:12 +10:00
David Spickett 3d5c1a8173 [compiler-rt][GWP-ASAN] Disable 2 tests on Armv7 Linux
These have been failing on our bots for a while due to
incomplete backtraces. (you don't get the names of the
functions that did the access, just the reporter frames)

See:
https://lab.llvm.org/buildbot/#/builders/170/builds/180
2021-07-19 10:45:11 +00:00
Lang Hames eaa329e76e [ORC-RT] Handle missing __has_builtin operator.
For compilers that do not support __has_builtin just return '0'. This should fix
the bot failure at https://lab.llvm.org/buildbot/#/builders/165/builds/3761.
2021-07-19 20:20:31 +10:00
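The fallback described in eaa329e76e follows a common preprocessor idiom, sketched below. `SketchLikely` is a hypothetical helper added only to show the macro being consumed; the macro names used inside the ORC runtime may differ.

```cpp
#include <cassert>

// Fallback for compilers without the __has_builtin operator: every query
// answers 0, so the portable branch is chosen.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical helper that uses a builtin only when it is known to exist.
inline bool SketchLikely(bool cond) {
#if __has_builtin(__builtin_expect)
  return __builtin_expect(cond, true);
#else
  return cond;
#endif
}

inline void SketchHasBuiltinDemo() {
  // The result is the same on either branch; only the hint differs.
  assert(SketchLikely(true));
  assert(!SketchLikely(false));
}
```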
Lang Hames bb5f97e3ad [ORC][ORC-RT] Introduce ORC-runtime based MachO-Platform.
Adds support for MachO static initializers/deinitializers and eh-frame
registration via the ORC runtime.

This commit introduces cooperative support code into the ORC runtime and ORC
LLVM libraries (especially the MachOPlatform class) to support macho runtime
features for JIT'd code. This commit introduces support for static
initializers, static destructors (via cxa_atexit interposition), and eh-frame
registration. Near-future commits will add support for MachO native
thread-local variables, and language runtime registration (e.g. for Objective-C
and Swift).

The llvm-jitlink tool is updated to use the ORC runtime where available, and
regression tests for the new MachOPlatform support are added to compiler-rt.

Notable changes on the ORC runtime side:

1. The new macho_platform.h / macho_platform.cpp files contain the bulk of the
runtime-side support. This includes eh-frame registration; jit versions of
dlopen, dlsym, and dlclose; a cxa_atexit interpose to record static destructors,
and an '__orc_rt_macho_run_program' function that defines running a JIT'd MachO
program in terms of the jit- dlopen/dlsym/dlclose functions.

2. Replaces JITTargetAddress (and casting operations) with ExecutorAddress
(copied from LLVM) to improve type-safety of address management.

3. Adds serialization support for ExecutorAddress and unordered_map types to
the runtime-side Simple Packed Serialization code.

4. Adds orc-runtime regression tests to ensure that static initializers and
cxa-atexit interposes work as expected.

Notable changes on the LLVM side:

1. The MachOPlatform class is updated to:

  1.1. Load the ORC runtime into the ExecutionSession.
  1.2. Set up standard aliases for macho-specific runtime functions. E.g.
       ___cxa_atexit -> ___orc_rt_macho_cxa_atexit.
  1.3. Install the MachOPlatformPlugin to scrape LinkGraphs for information
       needed to support MachO features (e.g. eh-frames, mod-inits), and
       communicate this information to the runtime.
  1.4. Provide entry-points that the runtime can call to request initializers,
       perform symbol lookup, and request deinitializers (the latter is
       implemented as an empty placeholder as macho object deinits are rarely
       used).
  1.5. Create a MachO header object for each JITDylib (defining the __mh_header
       and __dso_handle symbols).

2. The llvm-jitlink tool (and llvm-jitlink-executor) are updated to use the
runtime when available.

3. A `lookupInitSymbolsAsync` method is added to the Platform base class. This
can be used to issue an async lookup for initializer symbols. The existing
`lookupInitSymbols` method is retained (the GenericIRPlatform code is still
using it), but is deprecated and will be removed soon.

4. JIT-dispatch support code is added to ExecutorProcessControl.

The JIT-dispatch system allows handlers in the JIT process to be associated with
'tag' symbols in the executor, and allows the executor to make remote procedure
calls back to the JIT process (via __orc_rt_jit_dispatch) using those tags.

The primary use case is ORC runtime code that needs to call back to handlers in
orc::Platform subclasses. E.g. __orc_rt_macho_jit_dlopen calling back to
MachOPlatform::rt_getInitializers using __orc_rt_macho_get_initializers_tag.
(The system is generic however, and could be used by non-runtime code).

The new ExecutorProcessControl::JITDispatchInfo struct provides the address
(in the executor) of the jit-dispatch function and a jit-dispatch context
object, and implementations of the dispatch function are added to
SelfExecutorProcessControl and OrcRPCExecutorProcessControl.

5. OrcRPCTPCServer is updated to support JIT-dispatch calls over ORC-RPC.

6. Serialization support for StringMap is added to the LLVM-side Simple Packed
Serialization code.

7. A JITLink::allocateBuffer operation is introduced to allocate writable memory
attached to the graph. This is used by the MachO header synthesis code, and will
be generically useful for other clients who want to create new graph content
from scratch.
2021-07-19 19:50:16 +10:00
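The tag-based JIT-dispatch idea in bb5f97e3ad (handlers associated with 'tag' symbols, calls routed by tag) can be sketched in-process as a map from tag addresses to handlers. Everything here is hypothetical: `SketchDispatcher`, the string payloads, and the tag variable are illustrative stand-ins and not the ORC API, which serializes arguments and crosses a process boundary.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical handler type: takes a serialized argument, returns a result.
using SketchHandler = std::function<std::string(const std::string &)>;

class SketchDispatcher {
 public:
  // Associates a handler with a tag; the tag's address is the lookup key.
  void Associate(const void *tag, SketchHandler handler) {
    handlers_[tag] = std::move(handler);
  }

  // Routes a call to the handler registered for `tag`, if there is one.
  bool Dispatch(const void *tag, const std::string &arg,
                std::string &result) const {
    auto it = handlers_.find(tag);
    if (it == handlers_.end())
      return false;
    result = it->second(arg);
    return true;
  }

 private:
  std::unordered_map<const void *, SketchHandler> handlers_;
};

// A tag is just a symbol whose address serves as a unique key.
static char sketch_get_initializers_tag = 0;

inline void SketchDispatchDemo() {
  SketchDispatcher dispatcher;
  dispatcher.Associate(&sketch_get_initializers_tag,
                       [](const std::string &name) { return "inits:" + name; });
  std::string result;
  assert(dispatcher.Dispatch(&sketch_get_initializers_tag, "main", result));
  assert(result == "inits:main");
}
```

Keying on a symbol address keeps the caller and the handler decoupled: the caller only needs to know the tag, not which object implements the handler.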
Lang Hames ac5ce40fa8 [ORC-RT] Fix signedness warning in unit test. 2021-07-19 19:44:42 +10:00
David Carlier 657eb94324 [Sanitizers] FutexWake fix typo for FreeBSD code path. 2021-07-18 07:02:21 +01:00
Martin Storsjö 1f1369e476 [sanitizers] Fix building on case sensitive mingw platforms
Make synchronization.lib all lowercase name for mingw, where casing matters.

This fixes building after 6d160abd7eba73031a2af500981f8ef44bd75ee4.
2021-07-17 09:34:16 +03:00
Emily Shi b316c30269 [NFC][compiler-rt][test] when using ptrauth, strip before checking if poisoned
ptrauth stores extra info in the address of functions, so the unstripped value is not the right address to check for poisoning

rdar://75246928

Differential Revision: https://reviews.llvm.org/D106199
2021-07-16 17:13:19 -07:00
Vitaly Buka c14f26846e [sanitizer] Fix test build on Windows 2021-07-16 15:28:51 -07:00
Emily Shi df1c3aaa17 [NFC][compiler-rt][test] pass through MallocNanoZone to iossim env
This is required for no-fd.cpp test

rdar://79354597

Differential Revision: https://reviews.llvm.org/D106174
2021-07-16 13:16:40 -07:00
Fangrui Song 8f806d5f52 [test] Avoid llvm-readelf/llvm-readobj one-dash long options 2021-07-16 12:03:07 -07:00
Dmitry Vyukov db29c030df sanitizer_common: link Synchronization.lib on Windows
Windows bot failed with:
sanitizer_win.cpp.obj : error LNK2019: unresolved external symbol WaitOnAddress referenced in function "void __cdecl __sanitizer::FutexWait(struct __sanitizer::atomic_uint32_t *,unsigned int)" (?FutexWait@__sanitizer@@YAXPEAUatomic_uint32_t@1@I@Z)
sanitizer_win.cpp.obj : error LNK2019: unresolved external symbol WakeByAddressSingle referenced in function "void __cdecl __sanitizer::FutexWake(struct __sanitizer::atomic_uint32_t *,unsigned int)" (?FutexWake@__sanitizer@@YAXPEAUatomic_uint32_t@1@I@Z)
sanitizer_win.cpp.obj : error LNK2019: unresolved external symbol WakeByAddressAll referenced in function "void __cdecl __sanitizer::FutexWake(struct __sanitizer::atomic_uint32_t *,unsigned int)" (?FutexWake@__sanitizer@@YAXPEAUatomic_uint32_t@1@I@Z)
https://lab.llvm.org/buildbot/#/builders/127/builds/14046

According to MSDN we need to link Synchronization.lib:
https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-waitonaddress

Differential Revision: https://reviews.llvm.org/D106167
2021-07-16 19:59:04 +02:00
Emily Shi cfa4d112da [compiler-rt] change write order of frexpl & frexpf so it doesn't corrupt stack ids
This was fixed in the past for `frexp`, but the fix was not applied to `frexpl` & `frexpf`: https://github.com/google/sanitizers/issues/321
This patch copies the fix over to `frexpl`, because it caused the `frexp_interceptor.cpp` test to fail on iPhone, and to `frexpf` for consistency.

rdar://79652161

Reviewed By: delcypher, vitalybuka

Differential Revision: https://reviews.llvm.org/D104948
2021-07-16 10:58:12 -07:00
Dmitry Vyukov 6a4054ef06 sanitizer_common: add Semaphore
Semaphore is a portable way to park/unpark threads.
The plan is to use it to implement a portable blocking
mutex in subsequent changes. Semaphore can also be used
to efficiently wait for other things (e.g. we currently
spin to synchronize thread creation and start).

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D106071
2021-07-16 19:34:24 +02:00
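The park/unpark behavior described in 6a4054ef06 can be sketched with standard C++ primitives. This is an assumption-laden illustration: `SketchSemaphore` is a hypothetical class built on `std::mutex` and `std::condition_variable`, whereas the sanitizer runtime avoids the C++ standard library and uses its own low-level primitives.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

class SketchSemaphore {
 public:
  // Parks the caller until a unit is available, then consumes it.
  void Wait() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return count_ > 0; });
    --count_;
  }

  // Consumes a unit if one is available; never blocks.
  bool TryWait() {
    std::lock_guard<std::mutex> lock(mu_);
    if (count_ == 0)
      return false;
    --count_;
    return true;
  }

  // Makes `n` units available and wakes parked threads.
  void Post(std::uint32_t n = 1) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      count_ += n;
    }
    cv_.notify_all();
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  std::uint32_t count_ = 0;
};

inline void SketchSemaphoreDemo() {
  SketchSemaphore sem;
  sem.Post(2);
  sem.Wait();              // Returns at once: two units were posted.
  assert(sem.TryWait());   // Consumes the second unit.
  assert(!sem.TryWait());  // Nothing left, so a Wait() would park here.
}
```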
Petr Hosek 25dade54d3 [profile] Decommit memory after counter relocation
After we relocate counters, we no longer need to keep the original copy
around so we can return the memory back to the operating system.

Differential Revision: https://reviews.llvm.org/D104839
2021-07-15 22:49:21 -07:00
Nico Weber 557855e047 Revert "tsan: make obtaining current PC faster"
This reverts commit e33446ea58.
Doesn't build on mac, and causes other problems. See reports
on https://reviews.llvm.org/D106046 and https://reviews.llvm.org/D106081

Also revert follow-up "tsan: strip top inlined internal frames"
This reverts commit 7b302fc9b0.
2021-07-15 19:29:19 -04:00
Dmitry Vyukov c3c324dddf tsan: lock ScopedErrorReportLock around fork
Currently we don't lock ScopedErrorReportLock around fork
and it mostly works because tsan has its own report_mtx that
is locked around fork and tsan reports.
However, sanitizer_common code prints some reports of its own
which are not protected by tsan's report_mtx. So it's better
to lock ScopedErrorReportLock explicitly.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D106048
2021-07-15 21:00:11 +02:00
Dmitry Vyukov 7b302fc9b0 tsan: strip top inlined internal frames
The new GET_CURRENT_PC() can lead to spurious top inlined internal frames.
Here are 2 examples from bots, in both cases the malloc is supposed to be
the top frame (#0):

  WARNING: ThreadSanitizer: signal-unsafe call inside of a signal
    #0 __sanitizer::StackTrace::GetNextInstructionPc(unsigned long)
    #1 malloc

  Location is heap block of size 99 at 0xbe3800003800 allocated by thread T1:
    #0 __sanitizer::StackTrace::GetNextInstructionPc(unsigned long)
    #1 malloc

Let's strip these internal top frames from reports.
With other code changes I also observed some top frames
from __tsan::ScopedInterceptor, proactively remove these as well.

Differential Revision: https://reviews.llvm.org/D106081
2021-07-15 19:37:44 +02:00
Fangrui Song aa3df8ddcd [test] Avoid llvm-readelf/llvm-readobj one-dash long options and deprecated aliases (e.g. --file-headers) 2021-07-15 10:26:21 -07:00
Fangrui Song 7299c6f635 [test] Avoid llvm-nm one-dash long options 2021-07-15 09:50:36 -07:00
Dmitry Vyukov e33446ea58 tsan: make obtaining current PC faster
We obtain the current PC in all interceptors, and collectively
the common interceptor code contributes to the overall slowdown
(in particular for the cheaper str/mem* functions).

The current way to obtain the current PC involves:

  4493e1:       e8 3a f3 fe ff          callq  438720 <_ZN11__sanitizer10StackTrace12GetCurrentPcEv>
  4493e9:       48 89 c6                mov    %rax,%rsi

and the called function is:

uptr StackTrace::GetCurrentPc() {
  438720:       48 8b 04 24             mov    (%rsp),%rax
  438724:       c3                      retq

The new way uses address of a local label and involves just:

  44a888:       48 8d 35 fa ff ff ff    lea    -0x6(%rip),%rsi

I am not switching all uses of StackTrace::GetCurrentPc to GET_CURRENT_PC
because it may lead to some differences in produced reports and break tests.
The difference comes from the fact that currently we have the PC pointing
to the CALL instruction, but the new way does not yield any code on its own,
so the PC points to a random instruction in the function, and symbolizing
that instruction can produce additional inlined frames (if the random
instruction happens to relate to some inlined function).

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D106046
2021-07-15 17:34:00 +02:00
Ilya Leoshkevich 9bf2e7eeeb [TSan] Add SystemZ SANITIZER_GO support
Define the address ranges (similar to the C/C++ ones, but with the heap
range merged into the app range) and enable the sanity check.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629
2021-07-15 12:18:48 +02:00
Ilya Leoshkevich e34078f121 [TSan] Enable SystemZ support
Enable building the runtime and enable -fsanitize=thread in clang.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629
2021-07-15 12:18:48 +02:00
Ilya Leoshkevich 937242cecc [TSan] Adjust tests for SystemZ
XFAIL map32bit, define the maximum possible allocation size in
mmap_large.cpp.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629
2021-07-15 12:18:48 +02:00
Ilya Leoshkevich bd77f742d6 [TSan] Intercept __tls_get_addr_internal and __tls_get_offset on SystemZ
Reuse the assembly glue code from sanitizer_common_interceptors.inc and
the handling logic from the __tls_get_addr interceptor.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629
2021-07-15 12:18:48 +02:00
Ilya Leoshkevich b17673816d [TSan] Disable __TSAN_HAS_INT128 on SystemZ
SystemZ does not have 128-bit atomics.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629
2021-07-15 12:18:48 +02:00
Ilya Leoshkevich 402fc790eb [TSan] Add SystemZ longjmp support
Implement the interceptor and stack pointer demangling.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629
2021-07-15 12:18:48 +02:00
Ilya Leoshkevich 96a29df0b1 [TSan] Define C/C++ address ranges for SystemZ
The kernel supports a full 64-bit VMA, but we can use only 48 bits due
to the limitation imposed by SyncVar::GetId(). So define the address
ranges similar to the other architectures, except that the address
space "tail" needs to be made inaccessible in CheckAndProtect(). Since
it's for only one architecture, don't make an abstraction for this.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629
2021-07-15 12:18:47 +02:00
Ilya Leoshkevich fab044045b [TSan] Define PTHREAD_ABI_BASE for SystemZ
SystemZ's glibc symbols use version 2.3.2.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629
2021-07-15 12:18:47 +02:00
Ilya Leoshkevich d5c34ee5b6 [TSan] Build ignore_lib{0,1,5} tests with -fno-builtin
These tests depend on TSan seeing the intercepted memcpy(), so they
break when the compiler chooses the builtin version.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629
2021-07-15 12:18:47 +02:00
Ilya Leoshkevich cadbb92416 [TSan] Align thread_registry_placeholder
s390x requires ThreadRegistry.mtx_.opaque_storage_ to be 4-byte
aligned. Since other architectures may have similar requirements, use
the maximum thread_registry_placeholder alignment from other
sanitizers, which is 64 (LSan).

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629
2021-07-15 12:18:47 +02:00
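The placeholder alignment from cadbb92416 can be sketched as raw storage declared with `alignas` and populated by placement new. `SketchRegistry` and the helper names are hypothetical; only the alignment value of 64 comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Hypothetical stand-in for a registry whose first member needs alignment.
struct SketchRegistry {
  std::uint32_t opaque_storage = 0;
  int thread_count = 0;
};

// Raw storage with explicit alignment; 64 mirrors the value named above.
alignas(64) static char sketch_registry_placeholder[sizeof(SketchRegistry)];

// Constructs the registry in the placeholder on first use.
static SketchRegistry *GetSketchRegistry() {
  static SketchRegistry *registry =
      new (sketch_registry_placeholder) SketchRegistry();
  return registry;
}

static void SketchAlignmentDemo() {
  auto address = reinterpret_cast<std::uintptr_t>(GetSketchRegistry());
  assert(address % 64 == 0);  // Satisfies any requirement up to 64 bytes.
}
```

Without the `alignas`, a plain `char` array is only guaranteed 1-byte alignment, so a member that requires 4-byte alignment could land on a misaligned address.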
Ilya Leoshkevich 54128b73f8 [sanitizer] Force TLS allocation on s390
When running with an old glibc, CollectStaticTlsBlocks() calls
__tls_get_addr() in order to force TLS allocation. This function is not
available on s390 and the code simply does nothing in this case,
so all the resulting static TLS blocks end up being incorrect.

Fix by calling __tls_get_offset() on s390.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D105629
2021-07-15 12:18:47 +02:00