This mainly affects Darwin targets (macOS, iOS, tvOS and watchOS) when these targets don't use dSYM files and the debug info was in the .o files. All modules, including the .o files that are loaded by the debug maps, were in the global module list. This was great because it allows us to see each .o file and how much it contributes. There were virtual functions on the SymbolFile class to fetch the symtab/debug info parse and index times, and also the total debug info size. So the main executable would add all of the .o file's stats together and report them as its own data. Then the "totalDebugInfoSize" and many other "totalXXX" top level totals were all being added together. This stems from the fact that my original patch only emitted the modules for a target at the start of the patch, but as comments from the reviews came in, we switched to emitting all of the modules from the global module list.
So this patch fixes it so when we have a SymbolFileDWARFDebugMap that loads .o files, the main executable will have no debug info size or symtab/debug info parse/index times, but each .o file will have its own data as a separate module. Also, to be able to tell when/if we have a dSYM file I have added a "symbolFilePath" if the SymbolFile for the main modules path doesn't match that of the main executable. We also include a "symbolFileModuleIdentifiers" key in each module if the module does have multiple lldb_private::Module objects that contain debug info so that you can track down the information for a module and add up the contributions of all of the .o files.
Tests were added that are labeled with @skipUnlessDarwin and @no_debug_info_test that test all of this functionality so it doesn't regress.
For a module with a dSYM file, we can see the "symbolFilePath" is included:
```
"modules": [
{
"debugInfoByteSize": 1070,
"debugInfoIndexLoadedFromCache": false,
"debugInfoIndexSavedToCache": false,
"debugInfoIndexTime": 0,
"debugInfoParseTime": 0,
"identifier": 4873280600,
"path": "/Users/gclayton/Documents/src/lldb/main/Debug/lldb-test-build.noindex/commands/statistics/basic/TestStats.test_dsym_binary_has_symfile_in_stats/a.out",
"symbolFilePath": "/Users/gclayton/Documents/src/lldb/main/Debug/lldb-test-build.noindex/commands/statistics/basic/TestStats.test_dsym_binary_has_symfile_in_stats/a.out.dSYM/Contents/Resources/DWARF/a.out",
"symbolTableIndexTime": 7.9999999999999996e-06,
"symbolTableLoadedFromCache": false,
"symbolTableParseTime": 7.8999999999999996e-05,
"symbolTableSavedToCache": false,
"triple": "arm64-apple-macosx12.0.0",
"uuid": "E1F7D85B-3A42-321E-BF0D-29B103F5F2E3"
},
```
And for the DWARF in .o file case we can see the "symbolFileModuleIdentifiers" in the executable's module stats:
```
"modules": [
{
"debugInfoByteSize": 0,
"debugInfoIndexLoadedFromCache": false,
"debugInfoIndexSavedToCache": false,
"debugInfoIndexTime": 0,
"debugInfoParseTime": 0,
"identifier": 4603526968,
"path": "/Users/gclayton/Documents/src/lldb/main/Debug/lldb-test-build.noindex/commands/statistics/basic/TestStats.test_no_dsym_binary_has_symfile_identifiers_in_stats/a.out",
"symbolFileModuleIdentifiers": [
4604429832
],
"symbolTableIndexTime": 7.9999999999999996e-06,
"symbolTableLoadedFromCache": false,
"symbolTableParseTime": 0.000112,
"symbolTableSavedToCache": false,
"triple": "arm64-apple-macosx12.0.0",
"uuid": "57008BF5-A726-3DE9-B1BF-3A9AD3EE8569"
},
```
And the .o file for 4604429832 looks like:
```
{
"debugInfoByteSize": 1028,
"debugInfoIndexLoadedFromCache": false,
"debugInfoIndexSavedToCache": false,
"debugInfoIndexTime": 0,
"debugInfoParseTime": 6.0999999999999999e-05,
"identifier": 4604429832,
"path": "/Users/gclayton/Documents/src/lldb/main/Debug/lldb-test-build.noindex/commands/statistics/basic/TestStats.test_no_dsym_binary_has_symfile_identifiers_in_stats/main.o",
"symbolTableIndexTime": 0,
"symbolTableLoadedFromCache": false,
"symbolTableParseTime": 0,
"symbolTableSavedToCache": false,
"triple": "arm64-apple-macosx"
}
```
Differential Revision: https://reviews.llvm.org/D119400
The symbol table needs to demangle all symbol names when building its
index. However, this doesn't require the full mangled name: we only need
the base name and the function declaration context. Currently, we always
construct the demangled string during indexing and cache it in the
string pool as a way to speed up future lookups.
Constructing the demangled string is by far the most expensive step of
the demangling process, because the output string can be exponentially
larger than the input and unless you're dumping the symbol table, many
of those demangled names will not be needed again.
This patch avoids constructing the full demangled string when we can
partially demangle. This speeds up indexing and reduces memory usage.
I gathered some numbers by attaching to Slack:
Before
------
Memory usage: 280MB
Benchmark 1: ./bin/lldb -n Slack -o quit
Time (mean ± σ): 4.829 s ± 0.518 s [User: 4.012 s, System: 0.208 s]
Range (min … max): 4.624 s … 6.294 s 10 runs
After
-----
Memory usage: 189MB
Benchmark 1: ./bin/lldb -n Slack -o quit
Time (mean ± σ): 4.182 s ± 0.025 s [User: 3.536 s, System: 0.192 s]
Range (min … max): 4.152 s … 4.233 s 10 runs
Differential revision: https://reviews.llvm.org/D118814
Have the different ::Parse.* methods return the demangled string
directly instead of having to go through ::GetBufferRef.
Differential revision: https://reviews.llvm.org/D118953
Most of our code was including Log.h even though that is not where the
"lldb" log channel is defined (Log.h defines the generic logging
infrastructure). This worked because Log.h included Logging.h, even
though it should.
After the recent refactor, it became impossible the two files include
each other in this direction (the opposite inclusion is needed), so this
patch removes the workaround that was put in place and cleans up all
files to include the right thing. It also renames the file to LLDBLog to
better reflect its purpose.
This adds an option --show-tags to "memory read".
(lldb) memory read mte_buf mte_buf+32 -f "x" -s8 --show-tags
0x900fffff7ff8000: 0x0000000000000000 0x0000000000000000 (tag: 0x0)
0x900fffff7ff8010: 0x0000000000000000 0x0000000000000000 (tag: 0x1)
Tags are printed on the end of each line, if that
line has any tags associated with it. Meaning that
untagged memory output is unchanged.
Tags are printed based on the granule(s) of memory that
a line covers. So you may have lines with 1 tag, with many
tags, no tags or partially tagged lines.
In the case of partially tagged lines, untagged granules
will show "<no tag>" so that the ordering is obvious.
For example, a line that covers 2 granules where the first
is not tagged:
(lldb) memory read mte_buf-16 mte_buf+16 -l32 -f"x" --show-tags
0x900fffff7ff7ff0: 0x00000000 <...> (tags: <no tag> 0x0)
Untagged lines will just not have the "(tags: ..." at all.
Though they may be part of a larger output that does have
some tagged lines.
To do this I've extended DumpDataExtractor to also print
memory tags where it has a valid execution context and
is asked to print them.
There are no special alignment requirements, simply
use "memory read" as usual. All alignment is handled
in DumpDataExtractor.
We use MakeTaggedRanges to find all the tagged memory
in the current dump, then read all that into a MemoryTagMap.
The tag map is populated once in DumpDataExtractor and re-used
for each subsequently printed line (or recursive call of
DumpDataExtractor, which some formats do).
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D107140
Until the introduction of the C++ REPL, there was always a single REPL
language. Several places relied on this assumption through
repl_languages.GetSingularLanguage. Now that this is no longer the case,
we need a way to specify a selected/preferred REPL language. This patch
does that with the help of a debugger property, taking inspiration from
how we store the scripting language.
Differential revision: https://reviews.llvm.org/D116697
Remove the Mangled::operator! and Mangled::operator void* where the
comments in header and implementation files disagree and replace them
with operator bool.
This fix PR52702 as https://reviews.llvm.org/D106837 used the buggy
Mangled::operator! in Symbol::SynthesizeNameIfNeeded. For example,
consider the symbol "puts" in a hello world C program:
// Inside Symbol::SynthesizeNameIfNeeded
(lldb) p m_mangled
(lldb_private::Mangled) $0 = (m_mangled = None, m_demangled = "puts")
(lldb) p !m_mangled
(bool) $1 = true # should be false!!
This leads to Symbol::SynthesizeNameIfNeeded overwriting m_demangled
part of Mangled (in this case "puts").
In conclusion, this patch turns
callq 0x401030 ; symbol stub for: ___lldb_unnamed_symbol36
back into
callq 0x401030 ; symbol stub for: puts .
Differential Revision: https://reviews.llvm.org/D116217
This is an updated version of the https://reviews.llvm.org/D113789 patch with the following changes:
- We no longer modify modification times of the cache files
- Use LLVM caching and cache pruning instead of making a new cache mechanism (See DataFileCache.h/.cpp)
- Add signature to start of each file since we are not using modification times so we can tell when caches are stale and remove and re-create the cache file as files are changed
- Add settings to control the cache size, disk percentage and expiration in days to keep cache size under control
This patch enables symbol tables to be cached in the LLDB index cache directory. All cache files are in a single directory and the files use unique names to ensure that files from the same path will re-use the same file as files get modified. This means as files change, their cache files will be deleted and updated. The modification time of each of the cache files is not modified so that access based pruning of the cache can be implemented.
The symbol table cache files start with a signature that uniquely identifies a file on disk and contains one or more of the following items:
- object file UUID if available
- object file mod time if available
- object name for BSD archive .o files that are in .a files if available
If none of these signature items are available, then the file will not be cached. This keeps temporary object files from expressions from being cached.
When the cache files are loaded on subsequent debug sessions, the signature is compare and if the file has been modified (uuid changes, mod time changes, or object file mod time changes) then the cache file is deleted and re-created.
Module caching must be enabled by the user before this can be used:
symbols.enable-lldb-index-cache (boolean) = false
(lldb) settings set symbols.enable-lldb-index-cache true
There is also a setting that allows the user to specify a module cache directory that defaults to a directory that defaults to being next to the symbols.clang-modules-cache-path directory in a temp directory:
(lldb) settings show symbols.lldb-index-cache-path
/var/folders/9p/472sr0c55l9b20x2zg36b91h0000gn/C/lldb/IndexCache
If this setting is enabled, the finalized symbol tables will be serialized and saved to disc so they can be quickly loaded next time you debug.
Each module can cache one or more files in the index cache directory. The cache file names must be unique to a file on disk and its architecture and object name for .o files in BSD archives. This allows universal mach-o files to support caching multuple architectures in the same module cache directory. Making the file based on the this info allows this cache file to be deleted and replaced when the file gets updated on disk. This keeps the cache from growing over time during the compile/edit/debug cycle and prevents out of space issues.
If the cache is enabled, the symbol table will be loaded from the cache the next time you debug if the module has not changed.
The cache also has settings to control the size of the cache on disk. Each time LLDB starts up with the index cache enable, the cache will be pruned to ensure it stays within the user defined settings:
(lldb) settings set symbols.lldb-index-cache-expiration-days <days>
A value of zero will disable cache files from expiring when the cache is pruned. The default value is 7 currently.
(lldb) settings set symbols.lldb-index-cache-max-byte-size <size>
A value of zero will disable pruning based on a total byte size. The default value is zero currently.
(lldb) settings set symbols.lldb-index-cache-max-percent <percentage-of-disk-space>
A value of 100 will allow the disc to be filled to the max, a value of zero will disable percentage pruning. The default value is zero.
Reviewed By: labath, wallace
Differential Revision: https://reviews.llvm.org/D115324
Currently, we'll try to instantiate a ClangREPL for every known
language. The plugin manager already knows what languages it supports,
so rely on that to only instantiate a REPL when we know the requested
language is supported.
rdar://86439474
Differential revision: https://reviews.llvm.org/D115698
While profiling lldb (from swift/llvm-project), these timers were noticed to be short lived and high firing, and so they add noise more than value.
The data points I recorded are:
`FindTypes_Impl`: 49,646 calls, 812ns avg, 40.33ms total
`AppendSymbolIndexesWithName`: 36,229 calls, 913ns avg, 33.09ms total
`FindAllSymbolsWithNameAndType`: 36,229 calls, 1.93µs avg, 70.05ms total
`FindSymbolsWithNameAndType`: 23,263 calls, 3.09µs avg, 71.88ms total
Differential Revision: https://reviews.llvm.org/D115182
Symbol table parsing has evolved over the years and many plug-ins contained duplicate code in the ObjectFile::GetSymtab() that used to be pure virtual. With this change, the "Symbtab *ObjectFile::GetSymtab()" is no longer virtual and will end up calling a new "void ObjectFile::ParseSymtab(Symtab &symtab)" pure virtual function to actually do the parsing. This helps centralize the code for parsing the symbol table and allows the ObjectFile base class to do all of the common work, like taking the necessary locks and creating the symbol table object itself. Plug-ins now just need to parse when they are asked to parse as the ParseSymtab function will only get called once.
This is a retry of the original patch https://reviews.llvm.org/D113965 which was reverted. There was a deadlock in the Manual DWARF indexing code during symbol preloading where the module was asked on the main thread to preload its symbols, and this would in turn cause the DWARF manual indexing to use a thread pool to index all of the compile units, and if there were relocations on the debug information sections, these threads could ask the ObjectFile to load section contents, which could cause a call to ObjectFileELF::RelocateSection() which would ask for the symbol table from the module and it would deadlock. We can't lock the module in ObjectFile::GetSymtab(), so the solution I am using is to use a llvm::once_flag to create the symbol table object once and then lock the Symtab object. Since all APIs on the symbol table use this lock, this will prevent anyone from using the symbol table before it is parsed and finalized and will avoid the deadlock I mentioned. ObjectFileELF::GetSymtab() was never locking the module lock before and would put off creating the symbol table until somewhere inside ObjectFileELF::GetSymtab(). Now we create it one time inside of the ObjectFile::GetSymtab() and immediately lock it which should be safe enough. This avoids the deadlocks and still provides safety.
Differential Revision: https://reviews.llvm.org/D114288
Right now if the LLDB is compiled under the windows with static vcruntime library, the -o and -k commands will not work.
The problem is that the LLDB create FILE* in lldb.exe and pass it to liblldb.dll which is an object from CRT.
Since the CRT is statically linked each of these module has its own copy of the CRT with it's own global state and the LLDB should not share CRT objects between them.
In this change I moved the logic of creating FILE* out of commands stream from Driver class to SBDebugger.
To do this I added new method: SBError SBDebugger::SetInputStream(SBStream &stream)
Command to build the LLDB:
cmake -G Ninja -DLLVM_ENABLE_PROJECTS="clang;lldb;libcxx" -DLLVM_USE_CRT_RELEASE="MT" -DLLVM_USE_CRT_MINSIZEREL="MT" -DLLVM_USE_CRT_RELWITHDEBINFO="MT" -DP
YTHON_HOME:FILEPATH=C:/Python38 -DCMAKE_C_COMPILER:STRING=cl.exe -DCMAKE_CXX_COMPILER:STRING=cl.exe ../llvm
Command which will fail:
lldb.exe -o help
See discord discussion for more details: https://discord.com/channels/636084430946959380/636732809708306432/854629125398724628
This revision is for the further discussion.
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D104413
[NFC] As part of using inclusive language within the llvm project, this patch
replaces master in these comments.
Reviewed By: clayborg, JDevlieghere
Differential Revision: https://reviews.llvm.org/D114123
This is part of https://github.com/dlang/projects/issues/81 .
This patch enables support for D programming language demangler by using a
pretty printed stacktrace with demangled D symbols, when present.
Signed-off-by: Luís Ferreira <contact@lsferreira.net>
Reviewed By: JDevlieghere, teemperor
Differential Revision: https://reviews.llvm.org/D110578
The amount of roundtrips between StringRefs, ConstStrings and std::strings is
getting a bit out of hand, this patch avoid the unnecessary roundtrips.
Reviewed By: wallace, aprantl
Differential Revision: https://reviews.llvm.org/D112863
This has no uses and the ValueObjectDynamicValue already tracks
its ownership through the parent it is passed when made. I can't
find any vestiges of the use of this API, maybe it was from some
earlier design?
Resetting the backing ivar was the only job the destructor did, so I
set that to default as well.
Differential Revision: https://reviews.llvm.org/D112677
The new key/value pairs that are added to each module's stats are:
"debugInfoByteSize": The size in bytes of debug info for each module.
"debugInfoIndexTime": The time in seconds that it took to index the debug info.
"debugInfoParseTime": The time in seconds that debug info had to be parsed.
At the top level we add up all of the debug info size, parse time and index time with the following keys:
"totalDebugInfoByteSize": The size in bytes of all debug info in all modules.
"totalDebugInfoIndexTime": The time in seconds that it took to index all debug info if it was indexed for all modules.
"totalDebugInfoParseTime": The time in seconds that debug info was parsed for all modules.
Differential Revision: https://reviews.llvm.org/D112501
Add a Communication::WriteAll() that resumes Write() if the initial call
did not write all data. Use it in GDBRemoteCommunication when sending
packets in order to fix handling partial writes (i.e. just resume/retry
them rather than erring out). This fixes LLDB failures when writing
large packets to a pty.
Differential Revision: https://reviews.llvm.org/D112169
This patch deals with ObjectFile, ObjectContainer and OperatingSystem
plugins. I'll convert the other types in separate patches.
In order to enable piecemeal conversion, I am leaving some ConstStrings
in the lowest PluginManager layers. I'll convert those as the last step.
Differential Revision: https://reviews.llvm.org/D112061
When printing names in lldb on windows these names contain the full type information while on linux only the name is contained.
This change introduces a flag in the Microsoft demangler to control if the type information should be included.
With the flag enabled demangled name contains only the qualified name, e.g:
without flag -> with flag
int (*array2d)[10] -> array2d
int (*abc::array2d)[10] -> abc::array2d
const int *x -> x
For globals there is a second inconsistency which is not yet addressed by this change. On linux globals (in global namespace) are prefixed with :: while on windows they are not.
Reviewed By: teemperor, rnk
Differential Revision: https://reviews.llvm.org/D111715
There is no reason why this function should be returning a ConstString.
While modifying these files, I also fixed several instances where
GetPluginName and GetPluginNameStatic were returning different strings.
I am not changing the return type of GetPluginNameStatic in this patch, as that
would necessitate additional changes, and this patch is big enough as it is.
Differential Revision: https://reviews.llvm.org/D111877
When we know the bounds of the array, print any embedded nuls instead of
treating them as terminators. An exception to this rule is made for the
nul character at the very end of the string. We don't print that, as
otherwise 99% of the strings would end in \0. This way the strings
usually come out the same as how the user typed it into the compiler
(char foo[] = "with\0nuls"). It also matches how they come out in gdb.
This resolves a FIXME left from D111399, and leaves another FIXME for dealing
with nul characters in "escape-non-printables=false" mode. In this mode the
characters cause the entire summary string to be terminated prematurely.
Differential Revision: https://reviews.llvm.org/D111634
.. and reduce the scope of others. They don't follow llvm coding
standards (which say they should be used only when the same effect
cannot be achieved with the static keyword), and they set a bad example.
Update GetRegisterInfoByName() methods to support getting registers
by a generic name independently of alt_name entries in the register
context. This makes it possible to use generic names when interacting
with gdbserver (that does not supply alt_names). It also makes it
possible to remove some of the duplicated information from register
context declarations and/or use alt_names for another purpose.
Differential Revision: https://reviews.llvm.org/D108554
Extend PluginManager::SaveCore() to support saving core dumps
via Process plugins. Implement the client-side part of qSaveCore
request in the gdb-remote plugin, that creates the core dump
on the remote host and then uses vFile packets to transfer it.
Differential Revision: https://reviews.llvm.org/D101329
This change adds save-core functionality into the ObjectFileELF that enables
saving minidump of a stopped process. This change is mainly targeting Linux
running on x86_64 machines. Minidump should contain basic information needed
to examine state of threads, local variables and stack traces. Full support
for other platforms is not so far implemented. API tests are using LLDB's
MinidumpParser.
This relands commit aafa05e, reverted in 1f986f6.
Failed tests were fixed.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D108233