While on regular Linux system (Fedora 34 GA, not updated):
* thread #1, name = '1', stop reason = hit program assert
frame #0: 0x00007ffff7e242a2 libc.so.6`raise + 322
frame #1: 0x00007ffff7e0d8a4 libc.so.6`abort + 278
frame #2: 0x00007ffff7e0d789 libc.so.6`__assert_fail_base.cold + 15
frame #3: 0x00007ffff7e1ca16 libc.so.6`__assert_fail + 70
* frame #4: 0x00000000004011bd 1`main at assert.c:7:3
On Fedora 35 pre-release one gets:
* thread #1, name = '1', stop reason = signal SIGABRT
* frame #0: 0x00007ffff7e48ee3 libc.so.6`pthread_kill@GLIBC_2.2.5 + 67
frame #1: 0x00007ffff7dfb986 libc.so.6`raise + 22
frame #2: 0x00007ffff7de5806 libc.so.6`abort + 230
frame #3: 0x00007ffff7de571b libc.so.6`__assert_fail_base.cold + 15
frame #4: 0x00007ffff7df4646 libc.so.6`__assert_fail + 70
frame #5: 0x00000000004011bd 1`main at assert.c:7:3
I did not write a testcase as one needs the specific glibc. An
artificial test would just copy the changed source.
Reviewed By: mib
Differential Revision: https://reviews.llvm.org/D105133
Commit 090306fc80 (August 2020) changed most of the arm64 SVE_PT*
macros, but apparently did not make the changes in the
NativeRegisterContextLinux_arm64.* files (or those files were pulled
over from someplace else after that commit). This change replaces the
macros NativeRegisterContextLinux_arm64.cpp with the replacement
definitions in LinuxPTraceDefines_arm64sve.h. It also includes
LinuxPTraceDefines_arm64sve.h in NativeRegisterContextLinux_arm64.h.
Differential Revision: https://reviews.llvm.org/D104826
This fix was created after profiling the target creation of a large C/C++/ObjC application that contained almost 4,000,000 redacted symbol names. The symbol table parsing code was creating names for each of these synthetic symbols and adding them to the name indexes. The code was also adding the object file basename to the end of the symbol name which doesn't allow symbols from different shared libraries to share the names in the constant string pool.
Prior to this fix this was creating 180MB of "___lldb_unnamed_symbol" symbol names and was taking a long time to generate each name, add them to the string pool and then add each of these names to the name index.
This patch fixes the issue by:
- not adding a name to synthetic symbols at creation time, and allows name to be dynamically generated when accessed
- doesn't add synthetic symbol names to the name indexes, but catches this special case as name lookup time. Users won't typically set breakpoints or lookup these synthetic names, but support was added to do the lookup in case it does happen
- removes the object file baseanme from the generated names to allow the names to be shared in the constant string pool
Prior to this fix the startup times for a large application was:
12.5 seconds (cold file caches)
8.5 seconds (warm file caches)
After this fix:
9.7 seconds (cold file caches)
5.7 seconds (warm file caches)
The names of the symbols are auto generated by appending the symbol's UserID to the end of the "___lldb_unnamed_symbol" string and is only done when the name is requested from a synthetic symbol if it has no name.
Differential Revision: https://reviews.llvm.org/D105160
This patch implements a slight improvement when debugging across
platforms and remapping source paths that are in a non-native
format. See the unit test for examples.
rdar://79205675
Differential Revision: https://reviews.llvm.org/D104407
NFC.
This patch replaces the function body FindFile() with a call to
RemapPath(), since the two functions implement the same functionality.
Differential Revision: https://reviews.llvm.org/D104406
This is an NFC modernization refactoring that replaces the combination
of a bool return + reference argument, with an Optional return value.
Differential Revision: https://reviews.llvm.org/D104405
Reverts commits:
"Fix failing tests after https://reviews.llvm.org/D104488."
"Fix buildbot failure after https://reviews.llvm.org/D104488."
"Create synthetic symbol names on demand to improve memory consumption and startup times."
This series of commits broke the windows lldb bot and then failed to fix all of the failing tests.
When we check whether the Objective-C SPI is available, we need to check
for the mangled symbol name. Unlike `objc_copyRealizedClassList`, which
is C exported, the `nolock` variant is not.
Differential revision: https://reviews.llvm.org/D105136
Previously, when `interpreter.save-session-on-quit` was enabled, lldb
would save the session transcript only when running the `quit` command.
This patch changes that so the transcripts are saved when the debugger
object is destroyed if the setting is enabled.
rdar://72902650
Differential Revision: https://reviews.llvm.org/D105038
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch introduces a new interpreter setting
`interpreter.save-session-directory` so the user can specify a directory
where the session transcripts will be saved.
If not set, the session transcript are saved on a temporary file.
rdar://72902842
Differential Revision: https://reviews.llvm.org/D105030
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This fix was created after profiling the target creation of a large C/C++/ObjC application that contained almost 4,000,000 redacted symbol names. The symbol table parsing code was creating names for each of these synthetic symbols and adding them to the name indexes. The code was also adding the object file basename to the end of the symbol name which doesn't allow symbols from different shared libraries to share the names in the constant string pool.
Prior to this fix this was creating 180MB of "___lldb_unnamed_symbol" symbol names and was taking a long time to generate each name, add them to the string pool and then add each of these names to the name index.
This patch fixes the issue by:
- not adding a name to synthetic symbols at creation time, and allows name to be dynamically generated when accessed
- doesn't add synthetic symbol names to the name indexes, but catches this special case as name lookup time. Users won't typically set breakpoints or lookup these synthetic names, but support was added to do the lookup in case it does happen
- removes the object file baseanme from the generated names to allow the names to be shared in the constant string pool
Prior to this fix the startup times for a large application was:
12.5 seconds (cold file caches)
8.5 seconds (warm file caches)
After this fix:
9.7 seconds (cold file caches)
5.7 seconds (warm file caches)
The names of the symbols are auto generated by appending the symbol's UserID to the end of the "___lldb_unnamed_symbol" string and is only done when the name is requested from a synthetic symbol if it has no name.
Differential Revision: https://reviews.llvm.org/D104488
When we run `xcrun` we don't have any user input in our command so relying on
the user's default shell doesn't make a lot of sense. If the user has set the
system shell to a something that isn't supported yet (dash, ash) then we would
run into the problem that we don't know how to escape our command string.
This patch just avoids using any shell at all as xcrun is always at the same
path.
Reviewed By: aprantl, JDevlieghere, kastiglione
Differential Revision: https://reviews.llvm.org/D104653
Avoid standing the Objective-C runtime lock by calling
objc_copyRealizedClassList_nolock instead of objc_copyRealizedClassList.
We already guarantee that no other threads can run while we're running
this utility expression, similar to when we parse the data ourselves
from the gdb_objc_realized_classes struct.
Worst case this will crash if the list is getting edited, which won't do
any harm and we'll just try again later.
Differential revision: https://reviews.llvm.org/D104951
This was an oversight of the commit: bb93483c11 that
added support for the Frozen variants. Also added a test case for the way that
currently produces one of these variants (a copy).
This is an NFC modernization refactoring that replaces the combination
of a bool return + reference argument, with an Optional return value.
Differential Revision: https://reviews.llvm.org/D104404
These were disabled in 473a3a773e
because they failed on 32 bit platforms. (Arm for sure but I assume
any 32 bit)
This was due to the printf formatter used. These assumed
that types like uint64_t/size_t would be certain size/type and
that changes on 32 bit.
Instead use "z" to print the size_t and PRI<...> formatters
for the addr_t (always uint64_t) and the int32_t.
This new command looks much like "memory read"
and mirrors its basic behaviour.
(lldb) memory tag read new_buf_ptr new_buf_ptr+32
Logical tag: 0x9
Allocation tags:
[0x900fffff7ffa000, 0x900fffff7ffa010): 0x9
[0x900fffff7ffa010, 0x900fffff7ffa020): 0x0
Important proprties:
* The end address is optional and defaults to reading
1 tag if ommitted
* It is an error to try to read tags if the architecture
or process doesn't support it, or if the range asked
for is not tagged.
* It is an error to read an inverted range (end < begin)
(logical tags are removed for this check so you can
pass tagged addresses here)
* The range will be expanded to fit the tagging granule,
so you can get more tags than simply (end-begin)/granule size.
Whatever you get back will always cover the original range.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D97285
This adds GDB client support for the qMemTags packet
which reads memory tags. Following the design
which was recently committed to GDB.
https://sourceware.org/gdb/current/onlinedocs/gdb/General-Query-Packets.html#General-Query-Packets
(look for qMemTags)
lldb commands will use the new Process methods
GetMemoryTagManager and ReadMemoryTags.
The former takes a range and checks that:
* The current process architecture has an architecture plugin
* That plugin provides a MemoryTagManager
* That the range of memory requested lies in a tagged range
(it will expand it to granules for you)
If all that was true you get a MemoryTagManager you
can give to ReadMemoryTags.
This two step process is done to allow commands to get the
tag manager without having to read tags as well. For example
you might just want to remove a logical tag, or error early
if a range with tagged addresses is inverted.
Note that getting a MemoryTagManager doesn't mean that the process
or a specific memory range is tagged. Those are seperate checks.
Having a tag manager just means this architecture *could* have
a tagging feature enabled.
An architecture plugin has been added for AArch64 which
will return a MemoryTagManagerAArch64MTE, which was added in a
previous patch.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D95602
This adds memory tag reading using the new "qMemTags"
packet and ptrace on AArch64 Linux.
This new packet is following the one used by GDB.
(https://sourceware.org/gdb/current/onlinedocs/gdb/General-Query-Packets.html)
On AArch64 Linux we use ptrace's PEEKMTETAGS to read
tags and we assume that lldb has already checked that the
memory region actually has tagging enabled.
We do not assume that lldb has expanded the requested range
to granules and expand it again to be sure.
(although lldb will be sending aligned ranges because it happens
to need them client side anyway)
Also we don't assume untagged addresses. So for AArch64 we'll
remove the top byte before using them. (the top byte includes
MTE and other non address data)
To do the ptrace read NativeProcessLinux will ask the native
register context for a memory tag manager based on the
type in the packet. This also gives you the ptrace numbers you need.
(it's called a register context but it also has non register data,
so it saves adding another per platform sub class)
The only supported platform for this is AArch64 Linux and the only
supported tag type is MTE allocation tags. Anything else will
error.
Ptrace can return a partial result but for lldb-server we will
be treating that as an error. To succeed we need to get all the tags
we expect.
(Note that the protocol leaves room for logical tags to be
read via qMemTags but this is not going to be implemented for lldb
at this time.)
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D95601
This feature "memory-tagging+" indicates that lldb-server
supports memory tagging packets. (added in a later patch)
We check HWCAP2_MTE to decide whether to enable this
feature for Linux.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D97282
This adds the MemoryTagManager class and a specialisation
of that class for AArch64 MTE tags. It provides a generic
interface for various tagging operations.
Adding/removing tags, diffing tagged pointers, etc.
Later patches will use this manager to handle memory tags
in generic code in both lldb and lldb-server.
Since it will be used in both, the base class header is in
lldb/Target.
(MemoryRegionInfo is another example of this pattern)
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D97281
As a follow up of D103588, I'm reinitiating the discussion with a new proposal for traversing instructions in a trace which uses the feedback gotten in that diff.
See the embedded documentation in TraceCursor for more information. The idea is to offer an OOP way to traverse instructions exposing a minimal interface that makes no assumptions on:
- the number of instructions in the trace (i.e. having indices for instructions might be impractical for gigantic intel-pt traces, as it would require to decode the entire trace). This renders the use of indices to point to instructions impractical. Traces are big and expensive, and the consumer should try to do look linear lookups (forwards and/or backwards) and avoid random accesses (the API could be extended though, but for now I want to dicard that funcionality and leave the API extensible if needed).
- the way the instructions are represented internally by each Trace plug-in. They could be mmap'ed from a file, exist in plain vector or generated on the fly as the user requests the data.
- the actual data structure used internally for each plug-in. Ideas like having a struct TraceInstruction have been discarded because that would make the plug-in follow a certain data type, which might be costly. Instead, the user can ask the cursor for each independent property of the instruction it's pointing at.
The way to get a cursor is to ask Trace.h for the end or being cursor or a thread's trace.
There are some benefits of this approach:
- there's little cost to create a cursor, and this allows for lazily decoding a trace as the user requests data.
- each trace plug-in could decide how to cache the instructions it generates. For example, if a trace is small, it might decide to keep everything in memory, or if the trace is massive, it might decide to keep around the last thousands of instructions to speed up local searches.
- a cursor can outlive a stop point, which makes trace comparison for live processes feasible. An application of this is to compare profiling data of two runs of the same function, which should be doable with intel pt.
Differential Revision: https://reviews.llvm.org/D104422
We can extend/modify `GetMethodNameVariants` to suit our purposes here.
What symtab is looking for is alternate names we may want to use to
search for a specific symbol, and asking for variants of a name makes
the most sense here.
Differential Revision: https://reviews.llvm.org/D104067
I added asserts to these in https://reviews.llvm.org/D104525.
They are available (directly or otherwise) via the API so we
should not assert.
Restore the previous behaviour. If the message
is empty, we return early before printing anything.
For SetError don't assert that the error is a failure.
The remaining assert is in AppendRawError which
is not part of the API.
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D104778
Replacing existing uses with AppendError.
SetError is also part of the SBI API. This remains
but instead of calling the underlying SetError it
will call AppendError.
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D104768
Mostly by converting uses of GetErrorStream to AppendError,
so that the call to SetStatus is implicit.
Some remain where it isn't certain that you'll have a message
to set, or you want the output to be on stdout.
One place in CommandObjectWatchpoint previously didn't set
the status to failed at all. However it's pretty obvious
that it should do so.
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D104697
We should *never* use static local variables in this file as this makes
unittesting the plugin code impossible (and this whole 'testing' thing has
turned out to be rather useful so far).
LLDB supports having globbing regexes in the process launch arguments that will
be resolved using the user's shell. This requires that we pass the launch args
to the shell and then read back the expanded arguments using LLDB's argdumper
utility.
As the shell will not just expand the globbing regexes but all special
characters, we need to escape all non-globbing charcters such as $, &, <, >,
etc. as those otherwise are interpreted and removed in the step where we expand
the globbing characters. Also because the special characters are shell-specific,
LLDB needs to maintain a list of all the characters that need to be escaped for
each specific shell.
This patch adds the list of special characters that need to be escaped for
`zsh`. Without this patch on systems where `zsh` is the user's shell (like on
all macOS systems) having any of these special characters in your arguments or
path to the binary will cause the process launch to fail. E.g., `lldb -- ./calc
1<2` is failing without this patch. The same happens if the absolute path to
`calc` is in a directory that contains for example parentheses or other special
characters.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D104627
Rust's v0 name mangling scheme [1] is easy to disambiguate from other
name mangling schemes because symbols always start with `_R`. The llvm
Demangle library supports demangling the Rust v0 scheme. Use it to
demangle Rust symbols.
Added unit tests that check simple symbols. Ran LLDB built with this
patch to debug some Rust programs compiled with the v0 name mangling
scheme. Confirmed symbol names were demangled as expected.
Note: enabling the new name mangling scheme requires a nightly
toolchain:
```
$ cat main.rs
fn main() {
println!("Hello world!");
}
$ $(rustup which --toolchain nightly rustc) -Z symbol-mangling-version=v0 main.rs -g
$ /home/asm/hacking/llvm/build/bin/lldb ./main --one-line 'b main.rs:2'
(lldb) target create "./main"
Current executable set to '/home/asm/hacking/llvm/rust/main' (x86_64).
(lldb) b main.rs:2
Breakpoint 1: where = main`main::main + 4 at main.rs:2:5, address = 0x00000000000076a4
(lldb) r
Process 948449 launched: '/home/asm/hacking/llvm/rust/main' (x86_64)
warning: (x86_64) /lib64/libgcc_s.so.1 No LZMA support found for reading .gnu_debugdata section
Process 948449 stopped
* thread #1, name = 'main', stop reason = breakpoint 1.1
frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5
1 fn main() {
-> 2 println!("Hello world!");
3 }
(lldb) bt
error: need to add support for DW_TAG_base_type '()' encoded with DW_ATE = 0x7, bit_size = 0
* thread #1, name = 'main', stop reason = breakpoint 1.1
* frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5
frame #1: 0x000055555555b78b main`<fn() as core::ops::function::FnOnce<()>>::call_once((null)=(main`main::main at main.rs:1), (null)=<unavailable>) at function.rs:227:5
frame #2: 0x000055555555b66e main`std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>(f=(main`main::main at main.rs:1)) at backtrace.rs:125:18
frame #3: 0x000055555555b851 main`std::rt::lang_start::<()>::{closure#0} at rt.rs:49:18
frame #4: 0x000055555556c9f9 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] core::ops::function::impls::_$LT$impl$u20$core..ops..function..FnOnce$LT$A$GT$$u20$for$u20$$RF$F$GT$::call_once::h04259e4a34d07c2f at function.rs:259:13
frame #5: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::do_call::hb8da45704d5cfbbf at panicking.rs:401:40
frame #6: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::h4beadc19a78fec52 at panicking.rs:365:19
frame #7: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panic::catch_unwind::hc58016cd36ba81a4 at panic.rs:433:14
frame #8: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a at rt.rs:34:21
frame #9: 0x000055555555b830 main`std::rt::lang_start::<()>(main=(main`main::main at main.rs:1), argc=1, argv=0x00007fffffffcb18) at rt.rs:48:5
frame #10: 0x000055555555b6fc main`main + 28
frame #11: 0x00007ffff73f2493 libc.so.6`__libc_start_main + 243
frame #12: 0x000055555555b59e main`_start + 46
(lldb)
```
[1]: https://github.com/rust-lang/rust/issues/60705
Reviewed By: clayborg, teemperor
Differential Revision: https://reviews.llvm.org/D104054
The intention is now that AppendError/SetError/AppendRawError only
be called with some message to show. This enforces that.
For SetError with a Status and a fallback string first assert
that the Status is a failure Status. Then it calls SetError(StringRef)
which checks the message is valid. (which could be the fallback
or the Status')
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D104525
Add a new feature to process save-core on Darwin systems -- for
lldb to create a user process corefile with only the dirty (modified
memory) pages included. All of the binaries that were used in the
corefile are assumed to still exist on the system for the duration
of the use of the corefile. A new --style option to process save-core
is added, so a full corefile can be requested if portability across
systems, or across time, is needed for this corefile.
debugserver can now identify the dirty pages in a memory region
when queried with qMemoryRegionInfo, and the size of vm pages is
given in qHostInfo.
Create a new "all image infos" LC_NOTE for Mach-O which allows us
to describe all of the binaries that were loaded in the process --
load address, UUID, file path, segment load addresses, and optionally
whether code from the binary was executing on any thread. The old
"read dyld_all_image_infos and then the in-memory Mach-O load
commands to get segment load addresses" no longer works when we
only have dirty memory.
rdar://69670807
Differential Revision: https://reviews.llvm.org/D88387
Add support for extracting basic data from NetBSD/i386 core dumps.
FPU registers are not supported at the moment.
Differential Revision: https://reviews.llvm.org/D101091
This adds a basic SB API for creating and stopping traces.
Note: This doesn't add any APIs for inspecting individual instructions. That'd be a more complicated change and it might be better to enhande the dump functionality to output the data in binary format. I'll leave that for a later diff.
This also enhances the existing tests so that they test the same flow using both the command interface and the SB API.
I also did some cleanup of legacy code.
Differential Revision: https://reviews.llvm.org/D103500
This is part 2, covering the commands source.
Some uses remain where it's tricky to see what the
logic is or they are not used with AppendError.
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D104448
Since https://reviews.llvm.org/D103701 AppendError<...>
sets this for you.
This change includes all of the non-command uses.
Some uses remain where it's either tricky to reason about
the logic, or they aren't paired with AppendError calls.
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D104379
The idea is now that AppendError<...> will set eReturnStatusFailed
for you so you don't have to call SetStatus again.
Previously if the error message was empty, the status wouldn't
be set.
I don't think there are any sitautions where the message is in
fact empty but it potentially could be depending on where
we get the string from.
So let's set the status up front then return early if the message is empty.
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D104380
when dealing with -gmodules debug info.
This fixes the bot failures on Darwin.
A recent clang change (presumably https://reviews.llvm.org/D104291)
introduced a bug where .pcm files would identify themselves as
DW_LANG_C_plus_plus, but the .o that references them would identify as
DW_LANG_C_plus_plus_14. While that bug needs to be fixed, too, it
shows that the current strict comparison also isn't meaningful.
rdar://79423225
`vwprintw` is (in theory) using the `arargs.h` va_list while `vw_printw` is
using the `stdarg.h` va_list. It seems these days they can be used
interchangeably but `vwprintw` is marked as deprecated.