Multithreaded applications using fork(2) need to be extra careful about
what they do in the fork child. Without any special precautions (which
only really work if you can fully control all threads) they can only
safely call async-signal-safe functions. This is because the forked
child will contain snapshot of the parents memory at a random moment in
the execution of all of the non-forking threads (this is where the
similarity with signals comes in).
For example, the other threads could have been holding locks that can
now never be released in the child process and any attempt to obtain
them would block. This is what sometimes happen when using tcmalloc --
our fork child ends up hanging in the memory allocation routine. It is
also what happened with our logging code, which is why we added a
pthread_atfork hackaround.
This patch implements a proper fix to the problem, by which is to make
the child code async-signal-safe. The ProcessLaunchInfo structure is
transformed into a simpler ForkLaunchInfo representation, one which can
be read without allocating memory and invoking complex library
functions.
Strictly speaking this implementation is not async-signal-safe, as it
still invokes library functions outside of the posix-blessed set of
entry points. Strictly adhering to the spec would mean reimplementing a
lot of the functionality in pure C, so instead I rely on the fact that
any reasonable implementation of some functions (e.g.,
basic_string::c_str()) will not start allocating memory or doing other
unsafe things.
The new child code does not call into our logging infrastructure, which
enables us to remove the pthread_atfork call from there.
Differential Revision: https://reviews.llvm.org/D116165
This is an updated version of the https://reviews.llvm.org/D113789 patch with the following changes:
- We no longer modify modification times of the cache files
- Use LLVM caching and cache pruning instead of making a new cache mechanism (See DataFileCache.h/.cpp)
- Add signature to start of each file since we are not using modification times so we can tell when caches are stale and remove and re-create the cache file as files are changed
- Add settings to control the cache size, disk percentage and expiration in days to keep cache size under control
This patch enables symbol tables to be cached in the LLDB index cache directory. All cache files are in a single directory and the files use unique names to ensure that files from the same path will re-use the same file as files get modified. This means as files change, their cache files will be deleted and updated. The modification time of each of the cache files is not modified so that access based pruning of the cache can be implemented.
The symbol table cache files start with a signature that uniquely identifies a file on disk and contains one or more of the following items:
- object file UUID if available
- object file mod time if available
- object name for BSD archive .o files that are in .a files if available
If none of these signature items are available, then the file will not be cached. This keeps temporary object files from expressions from being cached.
When the cache files are loaded on subsequent debug sessions, the signature is compare and if the file has been modified (uuid changes, mod time changes, or object file mod time changes) then the cache file is deleted and re-created.
Module caching must be enabled by the user before this can be used:
symbols.enable-lldb-index-cache (boolean) = false
(lldb) settings set symbols.enable-lldb-index-cache true
There is also a setting that allows the user to specify a module cache directory that defaults to a directory that defaults to being next to the symbols.clang-modules-cache-path directory in a temp directory:
(lldb) settings show symbols.lldb-index-cache-path
/var/folders/9p/472sr0c55l9b20x2zg36b91h0000gn/C/lldb/IndexCache
If this setting is enabled, the finalized symbol tables will be serialized and saved to disc so they can be quickly loaded next time you debug.
Each module can cache one or more files in the index cache directory. The cache file names must be unique to a file on disk and its architecture and object name for .o files in BSD archives. This allows universal mach-o files to support caching multuple architectures in the same module cache directory. Making the file based on the this info allows this cache file to be deleted and replaced when the file gets updated on disk. This keeps the cache from growing over time during the compile/edit/debug cycle and prevents out of space issues.
If the cache is enabled, the symbol table will be loaded from the cache the next time you debug if the module has not changed.
The cache also has settings to control the size of the cache on disk. Each time LLDB starts up with the index cache enable, the cache will be pruned to ensure it stays within the user defined settings:
(lldb) settings set symbols.lldb-index-cache-expiration-days <days>
A value of zero will disable cache files from expiring when the cache is pruned. The default value is 7 currently.
(lldb) settings set symbols.lldb-index-cache-max-byte-size <size>
A value of zero will disable pruning based on a total byte size. The default value is zero currently.
(lldb) settings set symbols.lldb-index-cache-max-percent <percentage-of-disk-space>
A value of 100 will allow the disc to be filled to the max, a value of zero will disable percentage pruning. The default value is zero.
Reviewed By: labath, wallace
Differential Revision: https://reviews.llvm.org/D115324
The change to ArchSpec::SetArchitecture that was setting the
ObjectFile of a mach-o binary to llvm::Triple::MachO. It's not
necessary for my patch, and it changes the output of image list -t
causing TestUniversal.py to fail on x86_64 systems. The bots
turned up the failure, I was developing and testing this on
an Apple Silicon mac.
With arm64e ARMv8.3 pointer authentication, lldb needs to know how
many bits are used for addressing and how many are used for pointer
auth signing. This should be determined dynamically from the inferior
system / corefile, but there are some workflows where it still isn't
recorded and we fall back on a default value that is correct on some
Darwin environments.
This patch also explicitly sets the vendor of mach-o binaries to
Apple, so we select an Apple ABI instead of a random other ABI.
It adds a function pointer formatter for systems where pointer
authentication is in use, and we can strip the ptrauth bits off
of the function pointer address and get a different value that
points to an actual symbol.
Differential Revision: https://reviews.llvm.org/D115431
rdar://84644661
DataEncoder was previously made to modify data within an existing buffer. As the code progressed, new clients started using DataEncoder to create binary data. In these cases the use of this class was possibly, but only if you knew exactly how large your buffer would be ahead of time. This patchs adds the ability for DataEncoder to own a buffer that can be dynamically resized as data is appended to the buffer.
Change in this patch:
- Allow a DataEncoder object to be created that owns a DataBufferHeap object that can dynamically grow as data is appended
- Add new methods that start with "Append" to append data to the buffer and grow it as needed
- Adds full testing of the API to assure modifications don't regress any functionality
- Has two constructors: one that uses caller owned data and one that creates an object with object owned data
- "Append" methods only work if the object owns it own data
- Removes the ability to specify a shared memory buffer as no one was using this functionality. This allows us to switch to a case where the object owns its own data in a DataBufferHeap that can be resized as data is added
"Put" methods work on both caller and object owned data.
"Append" methods work on only object owned data where we can grow the buffer. These methods will return false if called on a DataEncoder object that has caller owned data.
The main reason for these modifications is to be able to use the DateEncoder objects instead of llvm::gsym::FileWriter in https://reviews.llvm.org/D113789. This patch wants to add the ability to create symbol table caching to LLDB and the code needs to build binary caches and save them to disk.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D115073
Since a8b54834a1, there are two
distinct Windows path styles, `windows_backslash` (with the old
`windows` being an alias for it) and `windows_slash`.
4e4883e1f3 added helpers for
inspecting path styles.
The newly added windows_slash path style doesn't end up used in
LLDB yet anyway, as LLDB is quite decoupled from most of
llvm::sys::path and uses its own FileSpec class. To take it in
use, it could be hooked up in `FileSpec::Style::GetNativeStyle`
(in lldb/source/Utility/FileSpec.cpp) just like in the `real_style`
function in llvm/lib/Support/Path.cpp in
df0ba47c36.
It is not currently clear whether there's a real need for using
the Windows path style with forward slashes in LLDB (if there's any
other applications interacting with it, expecting that style), and
what other changes in LLDB are needed for that to work, but this
at least makes some of the checks more ready for the new style,
simplifying code a bit.
Differential Revision: https://reviews.llvm.org/D113255
Use llvm::Optional<uint16_t> instead of int for port number
in UriParser::Parse(), and use llvm::None to indicate missing port
instead of a magic value of -1.
Differential Revision: https://reviews.llvm.org/D112309
Remove Status::WasInterrupted() that checks whether the underlying error
code matches EINTR. ProcessGDBRemote::ConnectToDebugserver() is its
only call site, and it does not seem correct there. After all, EINTR
is precisely when we want to retry, not stop retrying. Furthermore,
it should not really matter since we should be catching EINTR
immediately via llvm::sys::RetryAfterSignal() but that's another story.
Differential Revision: https://reviews.llvm.org/D111908
This makes the compiler generated code for accessing the thread local
variable much simpler (no need for wrapper functions and weak pointers
to potential init functions), and can avoid toolchain bugs regarding how
to access TLS variables.
In particular, this fixes LLDB when built with current GCC/binutils for
MinGW, see https://github.com/msys2/MINGW-packages/issues/8868.
Differential Revision: https://reviews.llvm.org/D111779
This reverts commits f9aba9a5af and
035217ff51.
As explained in the original commit message, this didn't have the
intended effect of improving the common LLDB use case, but still
provided a marginal improvement for the places where LLDB creates a
scoped time with a string literal.
The reason for the revert is that this change pulls in the os/signpost.h
header in Signposts.h. The former transitively includes loader.h, which
contains a series of macro defines that conflict with MachO.h. There are
ways to work around that, but Adrian and I concluded that none of them
are worth the trade-off in complicating Signposts.h even further.
Implement the simpler vRun packet and prefer it over the A packet.
Unlike the latter, it tranmits command-line arguments without redundant
indices and lengths. This also improves GDB compatibility since modern
versions of gdbserver do not implement the A packet at all.
Make qLaunchSuccess not obligatory when using vRun. It is not
implemented by gdbserver, and since vRun returns the stop reason,
we can assume it to be successful.
Differential Revision: https://reviews.llvm.org/D107931
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`. This achieves two things:
1) This starts standardizing predicates across the LLVM codebase,
following (in this case) ConstantInt. The word "Value" doesn't
convey anything of merit, and is missing in some of the other things.
2) Calling an integer "null" doesn't make any sense. The original sin
here is mine and I've regretted it for years. This moves us to calling
it "zero" instead, which is correct!
APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go. As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.
Included in this patch are changes to a bunch of the codebase, but there are
more. We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.
Differential Revision: https://reviews.llvm.org/D109483
Add a new SaveCore() process method that can be used to request a core
dump. This is currently implemented on NetBSD via the PT_DUMPCORE
ptrace(2) request, and enabled via 'savecore' extension.
Protocol-wise, a new qSaveCore packet is introduced. It accepts zero
or more semicolon-separated key:value options, invokes the core dump
and returns a key:value response. Currently the only option supported
is "path-hint", and the return value contains the "path" actually used.
The support for the feature is exposed via qSaveCore qSupported feature.
Differential Revision: https://reviews.llvm.org/D101285
This is implemented using the QMemTags packet, as specified
by GDB in:
https://sourceware.org/gdb/current/onlinedocs/gdb/General-Query-Packets.html#General-Query-Packets
(recall that qMemTags was previously added to read tags)
On receipt of a valid packet lldb-server will:
* align the given address and length to granules
(most of the time lldb will have already done this
but the specification doesn't guarantee it)
* Repeat the supplied tags as many times as needed to cover
the range. (if tags > range we just use as many as needed)
* Call ptrace POKEMTETAGS to write the tags.
The ptrace step will loop just like the tag read does,
until all tags are written or we get an error.
Meaning that if ptrace succeeds it could be a partial write.
So we call it again and if we then get an error, return an error to
lldb.
We are not going to attempt to restore tags after a partial
write followed by an error. This matches the behaviour of the
existing memory writes.
The lldb-server tests have been extended to include read and
write in the same test file. With some updated function names
since "qMemTags" vs "QMemTags" isn't very clear when they're
next to each other.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D105180
This adds memory tag reading using the new "qMemTags"
packet and ptrace on AArch64 Linux.
This new packet is following the one used by GDB.
(https://sourceware.org/gdb/current/onlinedocs/gdb/General-Query-Packets.html)
On AArch64 Linux we use ptrace's PEEKMTETAGS to read
tags and we assume that lldb has already checked that the
memory region actually has tagging enabled.
We do not assume that lldb has expanded the requested range
to granules and expand it again to be sure.
(although lldb will be sending aligned ranges because it happens
to need them client side anyway)
Also we don't assume untagged addresses. So for AArch64 we'll
remove the top byte before using them. (the top byte includes
MTE and other non address data)
To do the ptrace read NativeProcessLinux will ask the native
register context for a memory tag manager based on the
type in the packet. This also gives you the ptrace numbers you need.
(it's called a register context but it also has non register data,
so it saves adding another per platform sub class)
The only supported platform for this is AArch64 Linux and the only
supported tag type is MTE allocation tags. Anything else will
error.
Ptrace can return a partial result but for lldb-server we will
be treating that as an error. To succeed we need to get all the tags
we expect.
(Note that the protocol leaves room for logical tags to be
read via qMemTags but this is not going to be implemented for lldb
at this time.)
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D95601
LLDB supports having globbing regexes in the process launch arguments that will
be resolved using the user's shell. This requires that we pass the launch args
to the shell and then read back the expanded arguments using LLDB's argdumper
utility.
As the shell will not just expand the globbing regexes but all special
characters, we need to escape all non-globbing charcters such as $, &, <, >,
etc. as those otherwise are interpreted and removed in the step where we expand
the globbing characters. Also because the special characters are shell-specific,
LLDB needs to maintain a list of all the characters that need to be escaped for
each specific shell.
This patch adds the list of special characters that need to be escaped for
`zsh`. Without this patch on systems where `zsh` is the user's shell (like on
all macOS systems) having any of these special characters in your arguments or
path to the binary will cause the process launch to fail. E.g., `lldb -- ./calc
1<2` is failing without this patch. The same happens if the absolute path to
`calc` is in a directory that contains for example parentheses or other special
characters.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D104627
One nice feature of the os_signpost API is that format string
substitutions happen in the consumer, not the logging
application. LLVM's current Signpost class doesn't take advantage of
this though and instead always uses a static "Begin/End %s" format
string.
This patch uses variadic macros to allow the API to be used as
intended. Unfortunately, the primary use-case I had in mind (the
LLDB_SCOPED_TIMER() macro) does not get much better from this, because
__PRETTY_FUNCTION__ is *not* a macro, but a static string, so
signposts created by LLDB_SCOPED_TIMER() still use a static "%s"
format string. At least LLDB_SCOPED_TIMERF() works as intended.
This reapplies the previously reverted patch with additional include
order fixes for non-modular builds of LLDB.
Differential Revision: https://reviews.llvm.org/D103575
One nice feature of the os_signpost API is that format string
substitutions happen in the consumer, not the logging
application. LLVM's current Signpost class doesn't take advantage of
this though and instead always uses a static "Begin/End %s" format
string.
This patch uses variadic macros to allow the API to be used as
intended. Unfortunately, the primary use-case I had in mind (the
LLDB_SCOPED_TIMER() macro) does not get much better from this, because
__PRETTY_FUNCTION__ is *not* a macro, but a static string, so
signposts created by LLDB_SCOPED_TIMER() still use a static "%s"
format string. At least LLDB_SCOPED_TIMERF() works as intended.
This reapplies the previsously reverted patch with additional MachO.h
macro #undefs.
Differential Revision: https://reviews.llvm.org/D103575
One nice feature of the os_signpost API is that format string
substitutions happen in the consumer, not the logging
application. LLVM's current Signpost class doesn't take advantage of
this though and instead always uses a static "Begin/End %s" format
string.
This patch uses variadic macros to allow the API to be used as
intended. Unfortunately, the primary use-case I had in mind (the
LLDB_SCOPED_TIMER() macro) does not get much better from this, because
__PRETTY_FUNCTION__ is *not* a macro, but a static string, so
signposts created by LLDB_SCOPED_TIMER() still use a static "%s"
format string. At least LLDB_SCOPED_TIMERF() works as intended.
This reapplies the previsously reverted patch with support for
platforms where signposts are unavailable.
Differential Revision: https://reviews.llvm.org/D103575
One nice feature of the os_signpost API is that format string
substitutions happen in the consumer, not the logging
application. LLVM's current Signpost class doesn't take advantage of
this though and instead always uses a static "Begin/End %s" format
string.
This patch uses variadic macros to allow the API to be used as
intended. Unfortunately, the primary use-case I had in mind (the
LLDB_SCOPED_TIMER() macro) does not get much better from this, because
__PRETTY_FUNCTION__ is *not* a macro, but a static string, so
signposts created by LLDB_SCOPED_TIMER() still use a static "%s"
format string. At least LLDB_SCOPED_TIMERF() works as intended.
Differential Revision: https://reviews.llvm.org/D103575
This converts a default constructor's member initializers into C++11
default member initializers. This patch was automatically generated with
clang-tidy and the modernize-use-default-member-init check.
$ run-clang-tidy.py -header-filter='lldb' -checks='-*,modernize-use-default-member-init' -fix
This is a mass-refactoring patch and this commit will be added to
.git-blame-ignore-revs.
Differential revision: https://reviews.llvm.org/D103483
The C headers are deprecated so as requested in D102845, this is replacing them
all with their (not deprecated) C++ equivalent.
Reviewed By: shafik
Differential Revision: https://reviews.llvm.org/D103084
Add a NativeDelegate API to pass new processes (forks) to LLGS,
and support detaching them via the 'D' packet. A 'D' packet without
a specific PID detaches all processes, otherwise it detaches either
the specified subprocess or the main process, depending on the passed
PID.
Differential Revision: https://reviews.llvm.org/D100191
The armv6m entry in cores_match() got separated from its
friends armv7m and armv7em. Reuniting them to make it
easier to keep them updated in all at the same time.
Problem:
On SystemZ we need to open text files in text mode. On Windows, files opened in text mode adds a CRLF '\r\n' which may not be desirable.
Solution:
This patch adds two new flags
- OF_CRLF which indicates that CRLF translation is used.
- OF_TextWithCRLF = OF_Text | OF_CRLF indicates that the file is text and uses CRLF translation.
Developers should now use either the OF_Text or OF_TextWithCRLF for text files and OF_None for binary files. If the developer doesn't want carriage returns on Windows, they should use OF_Text, if they do want carriage returns on Windows, they should use OF_TextWithCRLF.
So this is the behaviour per platform with my patch:
z/OS:
OF_None: open in binary mode
OF_Text : open in text mode
OF_TextWithCRLF: open in text mode
Windows:
OF_None: open file with no carriage return
OF_Text: open file with no carriage return
OF_TextWithCRLF: open file with carriage return
The Major change is in llvm/lib/Support/Windows/Path.inc to only set text mode if the OF_CRLF is set.
```
if (Flags & OF_CRLF)
CrtOpenFlags |= _O_TEXT;
```
These following files are the ones that still use OF_Text which I left unchanged. I modified all these except raw_ostream.cpp in recent patches so I know these were previously in Binary mode on Windows.
./llvm/lib/Support/raw_ostream.cpp
./llvm/lib/TableGen/Main.cpp
./llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
./llvm/unittests/Support/Path.cpp
./clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp
./clang/lib/Frontend/CompilerInstance.cpp
./clang/lib/Driver/Driver.cpp
./clang/lib/Driver/ToolChains/Clang.cpp
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D99426
Remove the LLDB_CAPTURE_REPRODUCER as it is inherently dangerous. The
reproducers require careful initialization which cannot be guaranteed by
overwriting the reproducer mode at this level.
If we want to provide this functionality, we should do it in the driver
instead. It was originally added to enable capture in CI, but we now
have a dedicated CI job that captures and replays the test suite.
This implements the interactive trace start and stop methods.
This diff ended up being much larger than I anticipated because, by doing it, I found that I had implemented in the beginning many things in a non optimal way. In any case, the code is much better now.
There's a lot of boilerplate code due to the gdb-remote protocol, but the main changes are:
- New tracing packets: jLLDBTraceStop, jLLDBTraceStart, jLLDBTraceGetBinaryData. The gdb-remote packet definitions are quite comprehensive.
- Implementation of the "process trace start|stop" and "thread trace start|stop" commands.
- Implementaiton of an API in Trace.h to interact with live traces.
- Created an IntelPTDecoder for live threads, that use the debugger's stop id as checkpoint for its internal cache.
- Added a functionality to stop the process in case "process tracing" is enabled and a new thread can't traced.
- Added tests
I have some ideas to unify the code paths for post mortem and live threads, but I'll do that in another diff.
Differential Revision: https://reviews.llvm.org/D91679
Add a minimal support for the multiprocess extension in lldb-server.
The server indicates support for it via qSupported, and accepts
thread-ids containing a PID. However, it still does not support
debugging more than one inferior, so any other PID value results
in an error.
Differential Revision: https://reviews.llvm.org/D98482
Call `os_log_fault` when an lldb assert fails. We piggyback off
`LLVM_SUPPORT_XCODE_SIGNPOSTS`, which also depends on `os_log`, to avoid
having to introduce another CMake check and corresponding define.
This patch also adds a small test using lldb-test that verifies we abort
with a "regular" assertion when asserts are enabled.
Differential revision: https://reviews.llvm.org/D98987