llvm-project/lldb/source/Core/Mangled.cpp

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

505 lines
17 KiB
C++
Raw Normal View History

//===-- Mangled.cpp -------------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
#include "lldb/Core/Mangled.h"
Added the ability to cache the finalized symbol tables subsequent debug sessions to start faster. This is an updated version of the https://reviews.llvm.org/D113789 patch with the following changes: - We no longer modify modification times of the cache files - Use LLVM caching and cache pruning instead of making a new cache mechanism (See DataFileCache.h/.cpp) - Add signature to start of each file since we are not using modification times so we can tell when caches are stale and remove and re-create the cache file as files are changed - Add settings to control the cache size, disk percentage and expiration in days to keep cache size under control This patch enables symbol tables to be cached in the LLDB index cache directory. All cache files are in a single directory and the files use unique names to ensure that files from the same path will re-use the same file as files get modified. This means as files change, their cache files will be deleted and updated. The modification time of each of the cache files is not modified so that access based pruning of the cache can be implemented. The symbol table cache files start with a signature that uniquely identifies a file on disk and contains one or more of the following items: - object file UUID if available - object file mod time if available - object name for BSD archive .o files that are in .a files if available If none of these signature items are available, then the file will not be cached. This keeps temporary object files from expressions from being cached. When the cache files are loaded on subsequent debug sessions, the signature is compare and if the file has been modified (uuid changes, mod time changes, or object file mod time changes) then the cache file is deleted and re-created. Module caching must be enabled by the user before this can be used: symbols.enable-lldb-index-cache (boolean) = false (lldb) settings set symbols.enable-lldb-index-cache true There is also a setting that allows the user to specify a module cache directory that defaults to a directory that defaults to being next to the symbols.clang-modules-cache-path directory in a temp directory: (lldb) settings show symbols.lldb-index-cache-path /var/folders/9p/472sr0c55l9b20x2zg36b91h0000gn/C/lldb/IndexCache If this setting is enabled, the finalized symbol tables will be serialized and saved to disc so they can be quickly loaded next time you debug. Each module can cache one or more files in the index cache directory. The cache file names must be unique to a file on disk and its architecture and object name for .o files in BSD archives. This allows universal mach-o files to support caching multuple architectures in the same module cache directory. Making the file based on the this info allows this cache file to be deleted and replaced when the file gets updated on disk. This keeps the cache from growing over time during the compile/edit/debug cycle and prevents out of space issues. If the cache is enabled, the symbol table will be loaded from the cache the next time you debug if the module has not changed. The cache also has settings to control the size of the cache on disk. Each time LLDB starts up with the index cache enable, the cache will be pruned to ensure it stays within the user defined settings: (lldb) settings set symbols.lldb-index-cache-expiration-days <days> A value of zero will disable cache files from expiring when the cache is pruned. The default value is 7 currently. (lldb) settings set symbols.lldb-index-cache-max-byte-size <size> A value of zero will disable pruning based on a total byte size. The default value is zero currently. (lldb) settings set symbols.lldb-index-cache-max-percent <percentage-of-disk-space> A value of 100 will allow the disc to be filled to the max, a value of zero will disable percentage pruning. The default value is zero. Reviewed By: labath, wallace Differential Revision: https://reviews.llvm.org/D115324
2021-12-17 01:59:25 +08:00
#include "lldb/Core/DataFileCache.h"
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
#include "lldb/Core/RichManglingContext.h"
#include "lldb/Target/Language.h"
#include "lldb/Utility/ConstString.h"
Added the ability to cache the finalized symbol tables subsequent debug sessions to start faster. This is an updated version of the https://reviews.llvm.org/D113789 patch with the following changes: - We no longer modify modification times of the cache files - Use LLVM caching and cache pruning instead of making a new cache mechanism (See DataFileCache.h/.cpp) - Add signature to start of each file since we are not using modification times so we can tell when caches are stale and remove and re-create the cache file as files are changed - Add settings to control the cache size, disk percentage and expiration in days to keep cache size under control This patch enables symbol tables to be cached in the LLDB index cache directory. All cache files are in a single directory and the files use unique names to ensure that files from the same path will re-use the same file as files get modified. This means as files change, their cache files will be deleted and updated. The modification time of each of the cache files is not modified so that access based pruning of the cache can be implemented. The symbol table cache files start with a signature that uniquely identifies a file on disk and contains one or more of the following items: - object file UUID if available - object file mod time if available - object name for BSD archive .o files that are in .a files if available If none of these signature items are available, then the file will not be cached. This keeps temporary object files from expressions from being cached. When the cache files are loaded on subsequent debug sessions, the signature is compare and if the file has been modified (uuid changes, mod time changes, or object file mod time changes) then the cache file is deleted and re-created. Module caching must be enabled by the user before this can be used: symbols.enable-lldb-index-cache (boolean) = false (lldb) settings set symbols.enable-lldb-index-cache true There is also a setting that allows the user to specify a module cache directory that defaults to a directory that defaults to being next to the symbols.clang-modules-cache-path directory in a temp directory: (lldb) settings show symbols.lldb-index-cache-path /var/folders/9p/472sr0c55l9b20x2zg36b91h0000gn/C/lldb/IndexCache If this setting is enabled, the finalized symbol tables will be serialized and saved to disc so they can be quickly loaded next time you debug. Each module can cache one or more files in the index cache directory. The cache file names must be unique to a file on disk and its architecture and object name for .o files in BSD archives. This allows universal mach-o files to support caching multuple architectures in the same module cache directory. Making the file based on the this info allows this cache file to be deleted and replaced when the file gets updated on disk. This keeps the cache from growing over time during the compile/edit/debug cycle and prevents out of space issues. If the cache is enabled, the symbol table will be loaded from the cache the next time you debug if the module has not changed. The cache also has settings to control the size of the cache on disk. Each time LLDB starts up with the index cache enable, the cache will be pruned to ensure it stays within the user defined settings: (lldb) settings set symbols.lldb-index-cache-expiration-days <days> A value of zero will disable cache files from expiring when the cache is pruned. The default value is 7 currently. (lldb) settings set symbols.lldb-index-cache-max-byte-size <size> A value of zero will disable pruning based on a total byte size. The default value is zero currently. (lldb) settings set symbols.lldb-index-cache-max-percent <percentage-of-disk-space> A value of 100 will allow the disc to be filled to the max, a value of zero will disable percentage pruning. The default value is zero. Reviewed By: labath, wallace Differential Revision: https://reviews.llvm.org/D115324
2021-12-17 01:59:25 +08:00
#include "lldb/Utility/DataEncoder.h"
#include "lldb/Utility/LLDBLog.h"
#include "lldb/Utility/Log.h"
#include "lldb/Utility/RegularExpression.h"
#include "lldb/Utility/Stream.h"
#include "lldb/lldb-enumerations.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Demangle/Demangle.h"
#include "llvm/Support/Compiler.h"
#include <mutex>
#include <string>
#include <utility>
#include <cstdlib>
#include <cstring>
using namespace lldb_private;
static inline bool cstring_is_mangled(llvm::StringRef s) {
return Mangled::GetManglingScheme(s) != Mangled::eManglingSchemeNone;
}
#pragma mark Mangled
Mangled::ManglingScheme Mangled::GetManglingScheme(llvm::StringRef const name) {
if (name.empty())
return Mangled::eManglingSchemeNone;
if (name.startswith("?"))
return Mangled::eManglingSchemeMSVC;
[lldb] Enable Rust v0 symbol demangling Rust's v0 name mangling scheme [1] is easy to disambiguate from other name mangling schemes because symbols always start with `_R`. The llvm Demangle library supports demangling the Rust v0 scheme. Use it to demangle Rust symbols. Added unit tests that check simple symbols. Ran LLDB built with this patch to debug some Rust programs compiled with the v0 name mangling scheme. Confirmed symbol names were demangled as expected. Note: enabling the new name mangling scheme requires a nightly toolchain: ``` $ cat main.rs fn main() { println!("Hello world!"); } $ $(rustup which --toolchain nightly rustc) -Z symbol-mangling-version=v0 main.rs -g $ /home/asm/hacking/llvm/build/bin/lldb ./main --one-line 'b main.rs:2' (lldb) target create "./main" Current executable set to '/home/asm/hacking/llvm/rust/main' (x86_64). (lldb) b main.rs:2 Breakpoint 1: where = main`main::main + 4 at main.rs:2:5, address = 0x00000000000076a4 (lldb) r Process 948449 launched: '/home/asm/hacking/llvm/rust/main' (x86_64) warning: (x86_64) /lib64/libgcc_s.so.1 No LZMA support found for reading .gnu_debugdata section Process 948449 stopped * thread #1, name = 'main', stop reason = breakpoint 1.1 frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5 1 fn main() { -> 2 println!("Hello world!"); 3 } (lldb) bt error: need to add support for DW_TAG_base_type '()' encoded with DW_ATE = 0x7, bit_size = 0 * thread #1, name = 'main', stop reason = breakpoint 1.1 * frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5 frame #1: 0x000055555555b78b main`<fn() as core::ops::function::FnOnce<()>>::call_once((null)=(main`main::main at main.rs:1), (null)=<unavailable>) at function.rs:227:5 frame #2: 0x000055555555b66e main`std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>(f=(main`main::main at main.rs:1)) at backtrace.rs:125:18 frame #3: 0x000055555555b851 main`std::rt::lang_start::<()>::{closure#0} at rt.rs:49:18 frame #4: 0x000055555556c9f9 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] core::ops::function::impls::_$LT$impl$u20$core..ops..function..FnOnce$LT$A$GT$$u20$for$u20$$RF$F$GT$::call_once::h04259e4a34d07c2f at function.rs:259:13 frame #5: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::do_call::hb8da45704d5cfbbf at panicking.rs:401:40 frame #6: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::h4beadc19a78fec52 at panicking.rs:365:19 frame #7: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panic::catch_unwind::hc58016cd36ba81a4 at panic.rs:433:14 frame #8: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a at rt.rs:34:21 frame #9: 0x000055555555b830 main`std::rt::lang_start::<()>(main=(main`main::main at main.rs:1), argc=1, argv=0x00007fffffffcb18) at rt.rs:48:5 frame #10: 0x000055555555b6fc main`main + 28 frame #11: 0x00007ffff73f2493 libc.so.6`__libc_start_main + 243 frame #12: 0x000055555555b59e main`_start + 46 (lldb) ``` [1]: https://github.com/rust-lang/rust/issues/60705 Reviewed By: clayborg, teemperor Differential Revision: https://reviews.llvm.org/D104054
2021-06-21 23:56:55 +08:00
if (name.startswith("_R"))
return Mangled::eManglingSchemeRustV0;
if (name.startswith("_D"))
return Mangled::eManglingSchemeD;
if (name.startswith("_Z"))
return Mangled::eManglingSchemeItanium;
// ___Z is a clang extension of block invocations
if (name.startswith("___Z"))
return Mangled::eManglingSchemeItanium;
return Mangled::eManglingSchemeNone;
}
Mangled::Mangled(ConstString s) : m_mangled(), m_demangled() {
if (s)
SetValue(s);
}
Mangled::Mangled(llvm::StringRef name) {
if (!name.empty())
SetValue(ConstString(name));
}
// Convert to bool operator. This allows code to check any Mangled objects
// to see if they contain anything valid using code such as:
//
// Mangled mangled(...);
// if (mangled)
// { ...
Mangled::operator bool() const { return m_mangled || m_demangled; }
// Clear the mangled and demangled values.
void Mangled::Clear() {
m_mangled.Clear();
m_demangled.Clear();
}
// Compare the string values.
int Mangled::Compare(const Mangled &a, const Mangled &b) {
return ConstString::Compare(a.GetName(ePreferMangled),
b.GetName(ePreferMangled));
}
// Set the string value in this objects. If "mangled" is true, then the mangled
// named is set with the new value in "s", else the demangled name is set.
void Mangled::SetValue(ConstString s, bool mangled) {
if (s) {
if (mangled) {
m_demangled.Clear();
m_mangled = s;
} else {
m_demangled = s;
m_mangled.Clear();
}
} else {
m_demangled.Clear();
m_mangled.Clear();
}
}
void Mangled::SetValue(ConstString name) {
if (name) {
if (cstring_is_mangled(name.GetStringRef())) {
m_demangled.Clear();
m_mangled = name;
} else {
m_demangled = name;
m_mangled.Clear();
}
} else {
m_demangled.Clear();
m_mangled.Clear();
}
}
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
// Local helpers for different demangling implementations.
static char *GetMSVCDemangledStr(const char *M) {
char *demangled_cstr = llvm::microsoftDemangle(
M, nullptr, nullptr, nullptr, nullptr,
llvm::MSDemangleFlags(
llvm::MSDF_NoAccessSpecifier | llvm::MSDF_NoCallingConvention |
llvm::MSDF_NoMemberType | llvm::MSDF_NoVariableType));
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
if (Log *log = GetLog(LLDBLog::Demangle)) {
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
if (demangled_cstr && demangled_cstr[0])
LLDB_LOGF(log, "demangled msvc: %s -> \"%s\"", M, demangled_cstr);
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
else
LLDB_LOGF(log, "demangled msvc: %s -> error", M);
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
}
return demangled_cstr;
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
}
static char *GetItaniumDemangledStr(const char *M) {
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
char *demangled_cstr = nullptr;
llvm::ItaniumPartialDemangler ipd;
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
bool err = ipd.partialDemangle(M);
if (!err) {
// Default buffer and size (will realloc in case it's too small).
size_t demangled_size = 80;
demangled_cstr = static_cast<char *>(std::malloc(demangled_size));
demangled_cstr = ipd.finishDemangle(demangled_cstr, &demangled_size);
assert(demangled_cstr &&
"finishDemangle must always succeed if partialDemangle did");
assert(demangled_cstr[demangled_size - 1] == '\0' &&
"Expected demangled_size to return length including trailing null");
}
if (Log *log = GetLog(LLDBLog::Demangle)) {
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
if (demangled_cstr)
LLDB_LOGF(log, "demangled itanium: %s -> \"%s\"", M, demangled_cstr);
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
else
LLDB_LOGF(log, "demangled itanium: %s -> error: failed to demangle", M);
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
}
return demangled_cstr;
}
[lldb] Enable Rust v0 symbol demangling Rust's v0 name mangling scheme [1] is easy to disambiguate from other name mangling schemes because symbols always start with `_R`. The llvm Demangle library supports demangling the Rust v0 scheme. Use it to demangle Rust symbols. Added unit tests that check simple symbols. Ran LLDB built with this patch to debug some Rust programs compiled with the v0 name mangling scheme. Confirmed symbol names were demangled as expected. Note: enabling the new name mangling scheme requires a nightly toolchain: ``` $ cat main.rs fn main() { println!("Hello world!"); } $ $(rustup which --toolchain nightly rustc) -Z symbol-mangling-version=v0 main.rs -g $ /home/asm/hacking/llvm/build/bin/lldb ./main --one-line 'b main.rs:2' (lldb) target create "./main" Current executable set to '/home/asm/hacking/llvm/rust/main' (x86_64). (lldb) b main.rs:2 Breakpoint 1: where = main`main::main + 4 at main.rs:2:5, address = 0x00000000000076a4 (lldb) r Process 948449 launched: '/home/asm/hacking/llvm/rust/main' (x86_64) warning: (x86_64) /lib64/libgcc_s.so.1 No LZMA support found for reading .gnu_debugdata section Process 948449 stopped * thread #1, name = 'main', stop reason = breakpoint 1.1 frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5 1 fn main() { -> 2 println!("Hello world!"); 3 } (lldb) bt error: need to add support for DW_TAG_base_type '()' encoded with DW_ATE = 0x7, bit_size = 0 * thread #1, name = 'main', stop reason = breakpoint 1.1 * frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5 frame #1: 0x000055555555b78b main`<fn() as core::ops::function::FnOnce<()>>::call_once((null)=(main`main::main at main.rs:1), (null)=<unavailable>) at function.rs:227:5 frame #2: 0x000055555555b66e main`std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>(f=(main`main::main at main.rs:1)) at backtrace.rs:125:18 frame #3: 0x000055555555b851 main`std::rt::lang_start::<()>::{closure#0} at rt.rs:49:18 frame #4: 0x000055555556c9f9 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] core::ops::function::impls::_$LT$impl$u20$core..ops..function..FnOnce$LT$A$GT$$u20$for$u20$$RF$F$GT$::call_once::h04259e4a34d07c2f at function.rs:259:13 frame #5: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::do_call::hb8da45704d5cfbbf at panicking.rs:401:40 frame #6: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::h4beadc19a78fec52 at panicking.rs:365:19 frame #7: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panic::catch_unwind::hc58016cd36ba81a4 at panic.rs:433:14 frame #8: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a at rt.rs:34:21 frame #9: 0x000055555555b830 main`std::rt::lang_start::<()>(main=(main`main::main at main.rs:1), argc=1, argv=0x00007fffffffcb18) at rt.rs:48:5 frame #10: 0x000055555555b6fc main`main + 28 frame #11: 0x00007ffff73f2493 libc.so.6`__libc_start_main + 243 frame #12: 0x000055555555b59e main`_start + 46 (lldb) ``` [1]: https://github.com/rust-lang/rust/issues/60705 Reviewed By: clayborg, teemperor Differential Revision: https://reviews.llvm.org/D104054
2021-06-21 23:56:55 +08:00
static char *GetRustV0DemangledStr(const char *M) {
char *demangled_cstr = llvm::rustDemangle(M);
[lldb] Enable Rust v0 symbol demangling Rust's v0 name mangling scheme [1] is easy to disambiguate from other name mangling schemes because symbols always start with `_R`. The llvm Demangle library supports demangling the Rust v0 scheme. Use it to demangle Rust symbols. Added unit tests that check simple symbols. Ran LLDB built with this patch to debug some Rust programs compiled with the v0 name mangling scheme. Confirmed symbol names were demangled as expected. Note: enabling the new name mangling scheme requires a nightly toolchain: ``` $ cat main.rs fn main() { println!("Hello world!"); } $ $(rustup which --toolchain nightly rustc) -Z symbol-mangling-version=v0 main.rs -g $ /home/asm/hacking/llvm/build/bin/lldb ./main --one-line 'b main.rs:2' (lldb) target create "./main" Current executable set to '/home/asm/hacking/llvm/rust/main' (x86_64). (lldb) b main.rs:2 Breakpoint 1: where = main`main::main + 4 at main.rs:2:5, address = 0x00000000000076a4 (lldb) r Process 948449 launched: '/home/asm/hacking/llvm/rust/main' (x86_64) warning: (x86_64) /lib64/libgcc_s.so.1 No LZMA support found for reading .gnu_debugdata section Process 948449 stopped * thread #1, name = 'main', stop reason = breakpoint 1.1 frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5 1 fn main() { -> 2 println!("Hello world!"); 3 } (lldb) bt error: need to add support for DW_TAG_base_type '()' encoded with DW_ATE = 0x7, bit_size = 0 * thread #1, name = 'main', stop reason = breakpoint 1.1 * frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5 frame #1: 0x000055555555b78b main`<fn() as core::ops::function::FnOnce<()>>::call_once((null)=(main`main::main at main.rs:1), (null)=<unavailable>) at function.rs:227:5 frame #2: 0x000055555555b66e main`std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>(f=(main`main::main at main.rs:1)) at backtrace.rs:125:18 frame #3: 0x000055555555b851 main`std::rt::lang_start::<()>::{closure#0} at rt.rs:49:18 frame #4: 0x000055555556c9f9 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] core::ops::function::impls::_$LT$impl$u20$core..ops..function..FnOnce$LT$A$GT$$u20$for$u20$$RF$F$GT$::call_once::h04259e4a34d07c2f at function.rs:259:13 frame #5: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::do_call::hb8da45704d5cfbbf at panicking.rs:401:40 frame #6: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::h4beadc19a78fec52 at panicking.rs:365:19 frame #7: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panic::catch_unwind::hc58016cd36ba81a4 at panic.rs:433:14 frame #8: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a at rt.rs:34:21 frame #9: 0x000055555555b830 main`std::rt::lang_start::<()>(main=(main`main::main at main.rs:1), argc=1, argv=0x00007fffffffcb18) at rt.rs:48:5 frame #10: 0x000055555555b6fc main`main + 28 frame #11: 0x00007ffff73f2493 libc.so.6`__libc_start_main + 243 frame #12: 0x000055555555b59e main`_start + 46 (lldb) ``` [1]: https://github.com/rust-lang/rust/issues/60705 Reviewed By: clayborg, teemperor Differential Revision: https://reviews.llvm.org/D104054
2021-06-21 23:56:55 +08:00
if (Log *log = GetLog(LLDBLog::Demangle)) {
[lldb] Enable Rust v0 symbol demangling Rust's v0 name mangling scheme [1] is easy to disambiguate from other name mangling schemes because symbols always start with `_R`. The llvm Demangle library supports demangling the Rust v0 scheme. Use it to demangle Rust symbols. Added unit tests that check simple symbols. Ran LLDB built with this patch to debug some Rust programs compiled with the v0 name mangling scheme. Confirmed symbol names were demangled as expected. Note: enabling the new name mangling scheme requires a nightly toolchain: ``` $ cat main.rs fn main() { println!("Hello world!"); } $ $(rustup which --toolchain nightly rustc) -Z symbol-mangling-version=v0 main.rs -g $ /home/asm/hacking/llvm/build/bin/lldb ./main --one-line 'b main.rs:2' (lldb) target create "./main" Current executable set to '/home/asm/hacking/llvm/rust/main' (x86_64). (lldb) b main.rs:2 Breakpoint 1: where = main`main::main + 4 at main.rs:2:5, address = 0x00000000000076a4 (lldb) r Process 948449 launched: '/home/asm/hacking/llvm/rust/main' (x86_64) warning: (x86_64) /lib64/libgcc_s.so.1 No LZMA support found for reading .gnu_debugdata section Process 948449 stopped * thread #1, name = 'main', stop reason = breakpoint 1.1 frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5 1 fn main() { -> 2 println!("Hello world!"); 3 } (lldb) bt error: need to add support for DW_TAG_base_type '()' encoded with DW_ATE = 0x7, bit_size = 0 * thread #1, name = 'main', stop reason = breakpoint 1.1 * frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5 frame #1: 0x000055555555b78b main`<fn() as core::ops::function::FnOnce<()>>::call_once((null)=(main`main::main at main.rs:1), (null)=<unavailable>) at function.rs:227:5 frame #2: 0x000055555555b66e main`std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>(f=(main`main::main at main.rs:1)) at backtrace.rs:125:18 frame #3: 0x000055555555b851 main`std::rt::lang_start::<()>::{closure#0} at rt.rs:49:18 frame #4: 0x000055555556c9f9 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] core::ops::function::impls::_$LT$impl$u20$core..ops..function..FnOnce$LT$A$GT$$u20$for$u20$$RF$F$GT$::call_once::h04259e4a34d07c2f at function.rs:259:13 frame #5: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::do_call::hb8da45704d5cfbbf at panicking.rs:401:40 frame #6: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::h4beadc19a78fec52 at panicking.rs:365:19 frame #7: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panic::catch_unwind::hc58016cd36ba81a4 at panic.rs:433:14 frame #8: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a at rt.rs:34:21 frame #9: 0x000055555555b830 main`std::rt::lang_start::<()>(main=(main`main::main at main.rs:1), argc=1, argv=0x00007fffffffcb18) at rt.rs:48:5 frame #10: 0x000055555555b6fc main`main + 28 frame #11: 0x00007ffff73f2493 libc.so.6`__libc_start_main + 243 frame #12: 0x000055555555b59e main`_start + 46 (lldb) ``` [1]: https://github.com/rust-lang/rust/issues/60705 Reviewed By: clayborg, teemperor Differential Revision: https://reviews.llvm.org/D104054
2021-06-21 23:56:55 +08:00
if (demangled_cstr && demangled_cstr[0])
LLDB_LOG(log, "demangled rustv0: {0} -> \"{1}\"", M, demangled_cstr);
else
LLDB_LOG(log, "demangled rustv0: {0} -> error: failed to demangle", M);
}
return demangled_cstr;
}
static char *GetDLangDemangledStr(const char *M) {
char *demangled_cstr = llvm::dlangDemangle(M);
if (Log *log = GetLog(LLDBLog::Demangle)) {
if (demangled_cstr && demangled_cstr[0])
LLDB_LOG(log, "demangled dlang: {0} -> \"{1}\"", M, demangled_cstr);
else
LLDB_LOG(log, "demangled dlang: {0} -> error: failed to demangle", M);
}
return demangled_cstr;
}
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
// Explicit demangling for scheduled requests during batch processing. This
// makes use of ItaniumPartialDemangler's rich demangle info
bool Mangled::GetRichManglingInfo(RichManglingContext &context,
SkipMangledNameFn *skip_mangled_name) {
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
// Others are not meant to arrive here. ObjC names or C's main() for example
// have their names stored in m_demangled, while m_mangled is empty.
assert(m_mangled);
// Check whether or not we are interested in this name at all.
ManglingScheme scheme = GetManglingScheme(m_mangled.GetStringRef());
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
if (skip_mangled_name && skip_mangled_name(m_mangled.GetStringRef(), scheme))
return false;
switch (scheme) {
case eManglingSchemeNone:
// The current mangled_name_filter would allow llvm_unreachable here.
return false;
case eManglingSchemeItanium:
// We want the rich mangling info here, so we don't care whether or not
// there is a demangled string in the pool already.
return context.FromItaniumName(m_mangled);
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
case eManglingSchemeMSVC: {
// We have no rich mangling for MSVC-mangled names yet, so first try to
// demangle it if necessary.
if (!m_demangled && !m_mangled.GetMangledCounterpart(m_demangled)) {
if (char *d = GetMSVCDemangledStr(m_mangled.GetCString())) {
// Without the rich mangling info we have to demangle the full name.
// Copy it to string pool and connect the counterparts to accelerate
// later access in GetDemangledName().
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
m_demangled.SetStringWithMangledCounterpart(llvm::StringRef(d),
m_mangled);
::free(d);
} else {
m_demangled.SetCString("");
}
}
if (m_demangled.IsEmpty()) {
// Cannot demangle it, so don't try parsing.
return false;
} else {
// Demangled successfully, we can try and parse it with
// CPlusPlusLanguage::MethodName.
return context.FromCxxMethodName(m_demangled);
}
}
[lldb] Enable Rust v0 symbol demangling Rust's v0 name mangling scheme [1] is easy to disambiguate from other name mangling schemes because symbols always start with `_R`. The llvm Demangle library supports demangling the Rust v0 scheme. Use it to demangle Rust symbols. Added unit tests that check simple symbols. Ran LLDB built with this patch to debug some Rust programs compiled with the v0 name mangling scheme. Confirmed symbol names were demangled as expected. Note: enabling the new name mangling scheme requires a nightly toolchain: ``` $ cat main.rs fn main() { println!("Hello world!"); } $ $(rustup which --toolchain nightly rustc) -Z symbol-mangling-version=v0 main.rs -g $ /home/asm/hacking/llvm/build/bin/lldb ./main --one-line 'b main.rs:2' (lldb) target create "./main" Current executable set to '/home/asm/hacking/llvm/rust/main' (x86_64). (lldb) b main.rs:2 Breakpoint 1: where = main`main::main + 4 at main.rs:2:5, address = 0x00000000000076a4 (lldb) r Process 948449 launched: '/home/asm/hacking/llvm/rust/main' (x86_64) warning: (x86_64) /lib64/libgcc_s.so.1 No LZMA support found for reading .gnu_debugdata section Process 948449 stopped * thread #1, name = 'main', stop reason = breakpoint 1.1 frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5 1 fn main() { -> 2 println!("Hello world!"); 3 } (lldb) bt error: need to add support for DW_TAG_base_type '()' encoded with DW_ATE = 0x7, bit_size = 0 * thread #1, name = 'main', stop reason = breakpoint 1.1 * frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5 frame #1: 0x000055555555b78b main`<fn() as core::ops::function::FnOnce<()>>::call_once((null)=(main`main::main at main.rs:1), (null)=<unavailable>) at function.rs:227:5 frame #2: 0x000055555555b66e main`std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>(f=(main`main::main at main.rs:1)) at backtrace.rs:125:18 frame #3: 0x000055555555b851 main`std::rt::lang_start::<()>::{closure#0} at rt.rs:49:18 frame #4: 0x000055555556c9f9 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] core::ops::function::impls::_$LT$impl$u20$core..ops..function..FnOnce$LT$A$GT$$u20$for$u20$$RF$F$GT$::call_once::h04259e4a34d07c2f at function.rs:259:13 frame #5: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::do_call::hb8da45704d5cfbbf at panicking.rs:401:40 frame #6: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::h4beadc19a78fec52 at panicking.rs:365:19 frame #7: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panic::catch_unwind::hc58016cd36ba81a4 at panic.rs:433:14 frame #8: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a at rt.rs:34:21 frame #9: 0x000055555555b830 main`std::rt::lang_start::<()>(main=(main`main::main at main.rs:1), argc=1, argv=0x00007fffffffcb18) at rt.rs:48:5 frame #10: 0x000055555555b6fc main`main + 28 frame #11: 0x00007ffff73f2493 libc.so.6`__libc_start_main + 243 frame #12: 0x000055555555b59e main`_start + 46 (lldb) ``` [1]: https://github.com/rust-lang/rust/issues/60705 Reviewed By: clayborg, teemperor Differential Revision: https://reviews.llvm.org/D104054
2021-06-21 23:56:55 +08:00
case eManglingSchemeRustV0:
case eManglingSchemeD:
// Rich demangling scheme is not supported
[lldb] Enable Rust v0 symbol demangling Rust's v0 name mangling scheme [1] is easy to disambiguate from other name mangling schemes because symbols always start with `_R`. The llvm Demangle library supports demangling the Rust v0 scheme. Use it to demangle Rust symbols. Added unit tests that check simple symbols. Ran LLDB built with this patch to debug some Rust programs compiled with the v0 name mangling scheme. Confirmed symbol names were demangled as expected. Note: enabling the new name mangling scheme requires a nightly toolchain: ``` $ cat main.rs fn main() { println!("Hello world!"); } $ $(rustup which --toolchain nightly rustc) -Z symbol-mangling-version=v0 main.rs -g $ /home/asm/hacking/llvm/build/bin/lldb ./main --one-line 'b main.rs:2' (lldb) target create "./main" Current executable set to '/home/asm/hacking/llvm/rust/main' (x86_64). (lldb) b main.rs:2 Breakpoint 1: where = main`main::main + 4 at main.rs:2:5, address = 0x00000000000076a4 (lldb) r Process 948449 launched: '/home/asm/hacking/llvm/rust/main' (x86_64) warning: (x86_64) /lib64/libgcc_s.so.1 No LZMA support found for reading .gnu_debugdata section Process 948449 stopped * thread #1, name = 'main', stop reason = breakpoint 1.1 frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5 1 fn main() { -> 2 println!("Hello world!"); 3 } (lldb) bt error: need to add support for DW_TAG_base_type '()' encoded with DW_ATE = 0x7, bit_size = 0 * thread #1, name = 'main', stop reason = breakpoint 1.1 * frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5 frame #1: 0x000055555555b78b main`<fn() as core::ops::function::FnOnce<()>>::call_once((null)=(main`main::main at main.rs:1), (null)=<unavailable>) at function.rs:227:5 frame #2: 0x000055555555b66e main`std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>(f=(main`main::main at main.rs:1)) at backtrace.rs:125:18 frame #3: 0x000055555555b851 main`std::rt::lang_start::<()>::{closure#0} at rt.rs:49:18 frame #4: 0x000055555556c9f9 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] core::ops::function::impls::_$LT$impl$u20$core..ops..function..FnOnce$LT$A$GT$$u20$for$u20$$RF$F$GT$::call_once::h04259e4a34d07c2f at function.rs:259:13 frame #5: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::do_call::hb8da45704d5cfbbf at panicking.rs:401:40 frame #6: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::h4beadc19a78fec52 at panicking.rs:365:19 frame #7: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panic::catch_unwind::hc58016cd36ba81a4 at panic.rs:433:14 frame #8: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a at rt.rs:34:21 frame #9: 0x000055555555b830 main`std::rt::lang_start::<()>(main=(main`main::main at main.rs:1), argc=1, argv=0x00007fffffffcb18) at rt.rs:48:5 frame #10: 0x000055555555b6fc main`main + 28 frame #11: 0x00007ffff73f2493 libc.so.6`__libc_start_main + 243 frame #12: 0x000055555555b59e main`_start + 46 (lldb) ``` [1]: https://github.com/rust-lang/rust/issues/60705 Reviewed By: clayborg, teemperor Differential Revision: https://reviews.llvm.org/D104054
2021-06-21 23:56:55 +08:00
return false;
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
}
llvm_unreachable("Fully covered switch above!");
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
}
// Generate the demangled name on demand using this accessor. Code in this
// class will need to use this accessor if it wishes to decode the demangled
// name. The result is cached and will be kept until a new string value is
// supplied to this object, or until the end of the object's lifetime.
ConstString Mangled::GetDemangledName() const {
// Check to make sure we have a valid mangled name and that we haven't
// already decoded our mangled name.
if (m_mangled && m_demangled.IsNull()) {
// Don't bother running anything that isn't mangled
const char *mangled_name = m_mangled.GetCString();
ManglingScheme mangling_scheme =
GetManglingScheme(m_mangled.GetStringRef());
if (mangling_scheme != eManglingSchemeNone &&
!m_mangled.GetMangledCounterpart(m_demangled)) {
// We didn't already mangle this name, demangle it and if all goes well
// add it to our map.
char *demangled_name = nullptr;
switch (mangling_scheme) {
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
case eManglingSchemeMSVC:
demangled_name = GetMSVCDemangledStr(mangled_name);
break;
case eManglingSchemeItanium: {
demangled_name = GetItaniumDemangledStr(mangled_name);
break;
}
[lldb] Enable Rust v0 symbol demangling Rust's v0 name mangling scheme [1] is easy to disambiguate from other name mangling schemes because symbols always start with `_R`. The llvm Demangle library supports demangling the Rust v0 scheme. Use it to demangle Rust symbols. Added unit tests that check simple symbols. Ran LLDB built with this patch to debug some Rust programs compiled with the v0 name mangling scheme. Confirmed symbol names were demangled as expected. Note: enabling the new name mangling scheme requires a nightly toolchain: ``` $ cat main.rs fn main() { println!("Hello world!"); } $ $(rustup which --toolchain nightly rustc) -Z symbol-mangling-version=v0 main.rs -g $ /home/asm/hacking/llvm/build/bin/lldb ./main --one-line 'b main.rs:2' (lldb) target create "./main" Current executable set to '/home/asm/hacking/llvm/rust/main' (x86_64). (lldb) b main.rs:2 Breakpoint 1: where = main`main::main + 4 at main.rs:2:5, address = 0x00000000000076a4 (lldb) r Process 948449 launched: '/home/asm/hacking/llvm/rust/main' (x86_64) warning: (x86_64) /lib64/libgcc_s.so.1 No LZMA support found for reading .gnu_debugdata section Process 948449 stopped * thread #1, name = 'main', stop reason = breakpoint 1.1 frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5 1 fn main() { -> 2 println!("Hello world!"); 3 } (lldb) bt error: need to add support for DW_TAG_base_type '()' encoded with DW_ATE = 0x7, bit_size = 0 * thread #1, name = 'main', stop reason = breakpoint 1.1 * frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5 frame #1: 0x000055555555b78b main`<fn() as core::ops::function::FnOnce<()>>::call_once((null)=(main`main::main at main.rs:1), (null)=<unavailable>) at function.rs:227:5 frame #2: 0x000055555555b66e main`std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>(f=(main`main::main at main.rs:1)) at backtrace.rs:125:18 frame #3: 0x000055555555b851 main`std::rt::lang_start::<()>::{closure#0} at rt.rs:49:18 frame #4: 0x000055555556c9f9 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] core::ops::function::impls::_$LT$impl$u20$core..ops..function..FnOnce$LT$A$GT$$u20$for$u20$$RF$F$GT$::call_once::h04259e4a34d07c2f at function.rs:259:13 frame #5: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::do_call::hb8da45704d5cfbbf at panicking.rs:401:40 frame #6: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::h4beadc19a78fec52 at panicking.rs:365:19 frame #7: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panic::catch_unwind::hc58016cd36ba81a4 at panic.rs:433:14 frame #8: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a at rt.rs:34:21 frame #9: 0x000055555555b830 main`std::rt::lang_start::<()>(main=(main`main::main at main.rs:1), argc=1, argv=0x00007fffffffcb18) at rt.rs:48:5 frame #10: 0x000055555555b6fc main`main + 28 frame #11: 0x00007ffff73f2493 libc.so.6`__libc_start_main + 243 frame #12: 0x000055555555b59e main`_start + 46 (lldb) ``` [1]: https://github.com/rust-lang/rust/issues/60705 Reviewed By: clayborg, teemperor Differential Revision: https://reviews.llvm.org/D104054
2021-06-21 23:56:55 +08:00
case eManglingSchemeRustV0:
demangled_name = GetRustV0DemangledStr(mangled_name);
break;
case eManglingSchemeD:
demangled_name = GetDLangDemangledStr(mangled_name);
break;
case eManglingSchemeNone:
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
llvm_unreachable("eManglingSchemeNone was handled already");
}
if (demangled_name) {
Use rich mangling information in Symtab::InitNameIndexes() Summary: I set up a new review, because not all the code I touched was marked as a change in old one anymore. In preparation for this review, there were two earlier ones: * https://reviews.llvm.org/D49612 introduced the ItaniumPartialDemangler to LLDB demangling without conceptual changes * https://reviews.llvm.org/D49909 added a unit test that covers all relevant code paths in the InitNameIndexes() function Primary goals for this patch are: (1) Use ItaniumPartialDemangler's rich mangling info for building LLDB's name index. (2) Provide a uniform interface. (3) Improve indexing performance. The central implementation in this patch is our new function for explicit demangling: ``` const RichManglingInfo * Mangled::DemangleWithRichManglingInfo(RichManglingContext &, SkipMangledNameFn *) ``` It takes a context object and a filter function and provides read-only access to the rich mangling info on success, or otherwise returns null. The two new classes are: * `RichManglingInfo` offers a uniform interface to query symbol properties like `getFunctionDeclContextName()` or `isCtorOrDtor()` that are forwarded to the respective provider internally (`llvm::ItaniumPartialDemangler` or `lldb_private::CPlusPlusLanguage::MethodName`). * `RichManglingContext` works a bit like `LLVMContext`, it the actual `RichManglingInfo` returned from `DemangleWithRichManglingInfo()` and handles lifetime and configuration. It is likely stack-allocated and can be reused for multiple queries during batch processing. The idea here is that `DemangleWithRichManglingInfo()` acts like a gate keeper. It only provides access to `RichManglingInfo` on success, which in turn avoids the need to handle a `NoInfo` state in every single one of its getters. Having it stored within the context, avoids extra heap allocations and aids (3). As instantiations of the IPD the are considered expensive, the context is the ideal place to store it too. An efficient filtering function `SkipMangledNameFn` is another piece in the performance puzzle and it helps to mimic the original behavior of `InitNameIndexes`. Future potential: * `DemangleWithRichManglingInfo()` is thread-safe, IFF using different contexts in different threads. This may be exploited in the future. (It's another thing that it has in common with `LLVMContext`.) * The old implementation only parsed and indexed Itanium mangled names. The new `RichManglingInfo` can be extended for various mangling schemes and languages. One problem with the implementation of RichManglingInfo is the inaccessibility of class `CPlusPlusLanguage::MethodName` (defined in source/Plugins/Language/..), from within any header in the Core components of LLDB. The rather hacky solution is to store a type erased reference and cast it to the correct type on access in the cpp - see `RichManglingInfo::get<ParserT>()`. At the moment there seems to be no better way to do it. IMHO `CPlusPlusLanguage::MethodName` should be a top-level class in order to enable forward delcarations (but that is a rather big change I guess). First simple profiling shows a good speedup. `target create clang` now takes 0.64s on average. Before the change I observed runtimes between 0.76s an 1.01s. This is still no bulletproof data (I only ran it on one machine!), but it's a promising indicator I think. Reviewers: labath, jingham, JDevlieghere, erik.pilkington Subscribers: zturner, clayborg, mgorny, lldb-commits Differential Revision: https://reviews.llvm.org/D50071 llvm-svn: 339291
2018-08-09 05:57:37 +08:00
m_demangled.SetStringWithMangledCounterpart(
llvm::StringRef(demangled_name), m_mangled);
free(demangled_name);
}
}
if (m_demangled.IsNull()) {
// Set the demangled string to the empty string to indicate we tried to
// parse it once and failed.
m_demangled.SetCString("");
}
}
return m_demangled;
}
ConstString Mangled::GetDisplayDemangledName() const {
return GetDemangledName();
}
bool Mangled::NameMatches(const RegularExpression &regex) const {
if (m_mangled && regex.Execute(m_mangled.GetStringRef()))
return true;
ConstString demangled = GetDemangledName();
return demangled && regex.Execute(demangled.GetStringRef());
}
// Get the demangled name if there is one, else return the mangled name.
ConstString Mangled::GetName(Mangled::NamePreference preference) const {
if (preference == ePreferMangled && m_mangled)
return m_mangled;
// Call the accessor to make sure we get a demangled name in case it hasn't
// been demangled yet...
ConstString demangled = GetDemangledName();
if (preference == ePreferDemangledWithoutArguments) {
if (Language *lang = Language::FindPlugin(GuessLanguage())) {
return lang->GetDemangledFunctionNameWithoutArguments(*this);
}
}
if (preference == ePreferDemangled) {
if (demangled)
return demangled;
return m_mangled;
}
return demangled;
}
// Dump a Mangled object to stream "s". We don't force our demangled name to be
// computed currently (we don't use the accessor).
void Mangled::Dump(Stream *s) const {
if (m_mangled) {
*s << ", mangled = " << m_mangled;
}
if (m_demangled) {
const char *demangled = m_demangled.AsCString();
s->Printf(", demangled = %s", demangled[0] ? demangled : "<error>");
}
}
// Dumps a debug version of this string with extra object and state information
// to stream "s".
void Mangled::DumpDebug(Stream *s) const {
s->Printf("%*p: Mangled mangled = ", static_cast<int>(sizeof(void *) * 2),
static_cast<const void *>(this));
m_mangled.DumpDebug(s);
s->Printf(", demangled = ");
m_demangled.DumpDebug(s);
}
// Return the size in byte that this object takes in memory. The size includes
// the size of the objects it owns, and not the strings that it references
// because they are shared strings.
size_t Mangled::MemorySize() const {
return m_mangled.MemorySize() + m_demangled.MemorySize();
}
// We "guess" the language because we can't determine a symbol's language from
// it's name. For example, a Pascal symbol can be mangled using the C++
// Itanium scheme, and defined in a compilation unit within the same module as
// other C++ units. In addition, different targets could have different ways
// of mangling names from a given language, likewise the compilation units
// within those targets.
lldb::LanguageType Mangled::GuessLanguage() const {
lldb::LanguageType result = lldb::eLanguageTypeUnknown;
// Ask each language plugin to check if the mangled name belongs to it.
Language::ForEach([this, &result](Language *l) {
if (l->SymbolNameFitsToLanguage(*this)) {
result = l->GetLanguageType();
return false;
}
return true;
});
return result;
}
// Dump OBJ to the supplied stream S.
Stream &operator<<(Stream &s, const Mangled &obj) {
if (obj.GetMangledName())
s << "mangled = '" << obj.GetMangledName() << "'";
ConstString demangled = obj.GetDemangledName();
if (demangled)
s << ", demangled = '" << demangled << '\'';
else
s << ", demangled = <error>";
return s;
}
Added the ability to cache the finalized symbol tables subsequent debug sessions to start faster. This is an updated version of the https://reviews.llvm.org/D113789 patch with the following changes: - We no longer modify modification times of the cache files - Use LLVM caching and cache pruning instead of making a new cache mechanism (See DataFileCache.h/.cpp) - Add signature to start of each file since we are not using modification times so we can tell when caches are stale and remove and re-create the cache file as files are changed - Add settings to control the cache size, disk percentage and expiration in days to keep cache size under control This patch enables symbol tables to be cached in the LLDB index cache directory. All cache files are in a single directory and the files use unique names to ensure that files from the same path will re-use the same file as files get modified. This means as files change, their cache files will be deleted and updated. The modification time of each of the cache files is not modified so that access based pruning of the cache can be implemented. The symbol table cache files start with a signature that uniquely identifies a file on disk and contains one or more of the following items: - object file UUID if available - object file mod time if available - object name for BSD archive .o files that are in .a files if available If none of these signature items are available, then the file will not be cached. This keeps temporary object files from expressions from being cached. When the cache files are loaded on subsequent debug sessions, the signature is compare and if the file has been modified (uuid changes, mod time changes, or object file mod time changes) then the cache file is deleted and re-created. Module caching must be enabled by the user before this can be used: symbols.enable-lldb-index-cache (boolean) = false (lldb) settings set symbols.enable-lldb-index-cache true There is also a setting that allows the user to specify a module cache directory that defaults to a directory that defaults to being next to the symbols.clang-modules-cache-path directory in a temp directory: (lldb) settings show symbols.lldb-index-cache-path /var/folders/9p/472sr0c55l9b20x2zg36b91h0000gn/C/lldb/IndexCache If this setting is enabled, the finalized symbol tables will be serialized and saved to disc so they can be quickly loaded next time you debug. Each module can cache one or more files in the index cache directory. The cache file names must be unique to a file on disk and its architecture and object name for .o files in BSD archives. This allows universal mach-o files to support caching multuple architectures in the same module cache directory. Making the file based on the this info allows this cache file to be deleted and replaced when the file gets updated on disk. This keeps the cache from growing over time during the compile/edit/debug cycle and prevents out of space issues. If the cache is enabled, the symbol table will be loaded from the cache the next time you debug if the module has not changed. The cache also has settings to control the size of the cache on disk. Each time LLDB starts up with the index cache enable, the cache will be pruned to ensure it stays within the user defined settings: (lldb) settings set symbols.lldb-index-cache-expiration-days <days> A value of zero will disable cache files from expiring when the cache is pruned. The default value is 7 currently. (lldb) settings set symbols.lldb-index-cache-max-byte-size <size> A value of zero will disable pruning based on a total byte size. The default value is zero currently. (lldb) settings set symbols.lldb-index-cache-max-percent <percentage-of-disk-space> A value of 100 will allow the disc to be filled to the max, a value of zero will disable percentage pruning. The default value is zero. Reviewed By: labath, wallace Differential Revision: https://reviews.llvm.org/D115324
2021-12-17 01:59:25 +08:00
// When encoding Mangled objects we can get away with encoding as little
// information as is required. The enumeration below helps us to efficiently
// encode Mangled objects.
enum MangledEncoding {
/// If the Mangled object has neither a mangled name or demangled name we can
/// encode the object with one zero byte using the Empty enumeration.
Empty = 0u,
/// If the Mangled object has only a demangled name and no mangled named, we
/// can encode only the demangled name.
DemangledOnly = 1u,
/// If the mangle name can calculate the demangled name (it is the
/// mangled/demangled counterpart), then we only need to encode the mangled
/// name as the demangled name can be recomputed.
MangledOnly = 2u,
/// If we have a Mangled object with two different names that are not related
/// then we need to save both strings. This can happen if we have a name that
/// isn't a true mangled name, but we want to be able to lookup a symbol by
/// name and type in the symbol table. We do this for Objective C symbols like
/// "OBJC_CLASS_$_NSValue" where the mangled named will be set to
/// "OBJC_CLASS_$_NSValue" and the demangled name will be manually set to
/// "NSValue". If we tried to demangled the name "OBJC_CLASS_$_NSValue" it
/// would fail, but in these cases we want these unrelated names to be
/// preserved.
MangledAndDemangled = 3u
};
bool Mangled::Decode(const DataExtractor &data, lldb::offset_t *offset_ptr,
const StringTableReader &strtab) {
m_mangled.Clear();
m_demangled.Clear();
MangledEncoding encoding = (MangledEncoding)data.GetU8(offset_ptr);
switch (encoding) {
case Empty:
return true;
case DemangledOnly:
m_demangled.SetString(strtab.Get(data.GetU32(offset_ptr)));
return true;
case MangledOnly:
m_mangled.SetString(strtab.Get(data.GetU32(offset_ptr)));
return true;
case MangledAndDemangled:
m_mangled.SetString(strtab.Get(data.GetU32(offset_ptr)));
m_demangled.SetString(strtab.Get(data.GetU32(offset_ptr)));
return true;
}
return false;
}
/// The encoding format for the Mangled object is as follows:
///
/// uint8_t encoding;
/// char str1[]; (only if DemangledOnly, MangledOnly)
/// char str2[]; (only if MangledAndDemangled)
///
/// The strings are stored as NULL terminated UTF8 strings and str1 and str2
/// are only saved if we need them based on the encoding.
///
/// Some mangled names have a mangled name that can be demangled by the built
/// in demanglers. These kinds of mangled objects know when the mangled and
/// demangled names are the counterparts for each other. This is done because
/// demangling is very expensive and avoiding demangling the same name twice
/// saves us a lot of compute time. For these kinds of names we only need to
/// save the mangled name and have the encoding set to "MangledOnly".
///
/// If a mangled obejct has only a demangled name, then we save only that string
/// and have the encoding set to "DemangledOnly".
///
/// Some mangled objects have both mangled and demangled names, but the
/// demangled name can not be computed from the mangled name. This is often used
/// for runtime named, like Objective C runtime V2 and V3 names. Both these
/// names must be saved and the encoding is set to "MangledAndDemangled".
///
/// For a Mangled object with no names, we only need to set the encoding to
/// "Empty" and not store any string values.
void Mangled::Encode(DataEncoder &file, ConstStringTable &strtab) const {
MangledEncoding encoding = Empty;
if (m_mangled) {
encoding = MangledOnly;
if (m_demangled) {
// We have both mangled and demangled names. If the demangled name is the
// counterpart of the mangled name, then we only need to save the mangled
// named. If they are different, we need to save both.
ConstString s;
if (!(m_mangled.GetMangledCounterpart(s) && s == m_demangled))
encoding = MangledAndDemangled;
}
} else if (m_demangled) {
encoding = DemangledOnly;
}
file.AppendU8(encoding);
switch (encoding) {
case Empty:
break;
case DemangledOnly:
file.AppendU32(strtab.Add(m_demangled));
break;
case MangledOnly:
file.AppendU32(strtab.Add(m_mangled));
break;
case MangledAndDemangled:
file.AppendU32(strtab.Add(m_mangled));
file.AppendU32(strtab.Add(m_demangled));
break;
}
}