The rust demangler has some odd buffer handling code, which will copy
the demangled string into the provided buffer, if it will fit.
Otherwise it uses the allocated buffer it made. But the length of the
incoming buffer will have come from a previous call, which was the
length of the demangled string -- not the buffer size. And of course,
we're unconditionally allocating a temporary buffer in the first
place. So we don't actually get buffer reuse, and we get a memcpy in
somecases.
However, nothing in LLVM ever passes in a non-null pointer. Neither
does anything pass in a status pointer that is then made use of. The
only exercise these have is in the test suite.
So let's just make the rust demangler have the same API as the dlang
demangler.
Reviewed By: tmiasko
Differential Revision: https://reviews.llvm.org/D123420
The demangler has a utility class 'SwapAndRestore'. That name is
confusing. It's not swapping anything, and the restore part happens at
the object's destruction. What it's actually doing is allowing a
override of some value that is dynamically accessible within the
lifetime of a lexical scope. Thus rename it to ScopedOverride, and
tweak it's member variable names.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D122606
The OutputBuffer class tries to present a NUL-terminated string API to
consumers. But several of them would prefer a StringView. In
particular the Microsoft demangler, juggles between NUL-terminated and
StringView, which is confusing.
This adds a StringView conversion, and adjusts the Demanglers that can
benefit from that.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D120990
The microsoft demangler makes copies of the demangled strings, but has
some confusion between StringView representation (sans NUL), and
C-strings (with NUL). Here we also have a use of strcpy, which
happens to work because the incoming string view happens to have a
trailing NUL. But a simple memcpy excluding the NUL is sufficient.
Reviewed By: dblaikie, erichkeane
Differential Revision: https://reviews.llvm.org/D122391
Add support for module name demangling. We have two new demangler
nodes -- ModuleName and ModuleEntity. The former represents a module
name in a hierarchical fashion. The latter is the combination of a
(name) node and a module name. Because module names and entity
identities use the same substitution encoding, we have to adjust the
flow of how substitutions are handled, and examine the substituted
node to know how to deal with it.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D119933
The StdQualifiedName node class is used for names exactly in the std
namespace. It is not used for nested names that descend further --
those use a NestedName with NameType("std") as the scope.
Representing the compression scheme in the node graph is layer
breaking. We can use the same structure for those exactly in std too,
and reduce code size a bit.
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D118249
As an hint to the impact of the cleanup, running
clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Demangle/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 208053 lines
after: 203965 lines
Since Ret parameter is never meant to be nullptr, let's pass it by reference instead of a raw pointer.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D117046
This patch adds support for type back referencing, allowing demangling of
compressed mangled symbols with repetitive types.
Signed-off-by: Luís Ferreira <contact@lsferreira.net>
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111419
This patch adds support for identifier back referencing allowing compressed
mangled names by avoiding repetitiveness.
Signed-off-by: Luís Ferreira <contact@lsferreira.net>
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111417
This patch implements simple demangling of two basic types to add minimal type functionality. This will be later used in function type parsing. After that being implemented we can add the rest of the types and test the result of the type name.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111416
Internally `__Sddd` function-local parent symbols are used to solve ambiguities on symbols in
the same scope with the same mangled name.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D114309
Anonymous symbols are represented by 0 in the mangled symbol. We should skip
them in order to represent the demangled name correctly, otherwise demangled
names like `demangle..anon` can happen.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D114307
This patch adds support for simple single qualified names that includes
internal mangled names and normal symbol names.
Differential Revision: https://reviews.llvm.org/D111415
This patch adds minimal support for D programming language demangling on LLVM
core based on the D name mangling spec. This will allow easier integration on a
future LLDB plugin for D either in the upstream tree or outside of it.
Minimal support includes recognizing D demangling encoding and at least one
mangling name, which in this case is `_Dmain` mangle.
Reviewed By: jhenderson, lattner
Differential Revision: https://reviews.llvm.org/D111414
This patch is a refactor to implement prepend afterwards. Since this changes a lot of files and to conform with guidelines, I will separate this from the implementation of prepend. Related to the discussion in https://reviews.llvm.org/D111414 , so please read it for more context.
Reviewed By: #libc_abi, dblaikie, ldionne
Differential Revision: https://reviews.llvm.org/D111947
When printing names in lldb on windows these names contain the full type information while on linux only the name is contained.
This change introduces a flag in the Microsoft demangler to control if the type information should be included.
With the flag enabled demangled name contains only the qualified name, e.g:
without flag -> with flag
int (*array2d)[10] -> array2d
int (*abc::array2d)[10] -> abc::array2d
const int *x -> x
For globals there is a second inconsistency which is not yet addressed by this change. On linux globals (in global namespace) are prefixed with :: while on windows they are not.
Reviewed By: teemperor, rnk
Differential Revision: https://reviews.llvm.org/D111715
Introduce a new demangling function that supports symbols using Itanium
mangling and Rust v0 mangling, and is expected in the near future to
include support for D mangling as well.
Unlike llvm::demangle, the function does not accept extra underscore
decoration. The callers generally know exactly when symbols should
include the extra decoration and so they should be responsible for
stripping it.
Functionally the only intended change is to allow demangling Rust
symbols with an extra underscore decoration through llvm::demangle,
which matches the existing behaviour for Itanium symbols.
Reviewed By: dblaikie, jhenderson
Part of https://reviews.llvm.org/D110664
Rust allows use of non-ASCII identifiers, which in Rust mangling scheme
are encoded using Punycode.
The encoding deviates from the standard by using an underscore as the
separator between ASCII part and a base-36 encoding of non-ASCII
characters (avoiding hypen-minus in the symbol name). Other than that,
the encoding follows the standard, and the decoder implemented here in
turn follows the one given in RFC 3492.
To avoid an extra intermediate memory allocation while decoding
Punycode, the interface of OutputStream is extended with an insert
method.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D104366
This change is intended as initial setup. The plan is to add
more semantic checks later. I plan to update the documentation
as more semantic checks are added (instead of documenting the
details up front). Most of the code closely mirrors that for
the Swift calling convention. Three places are marked as
[FIXME: swiftasynccc]; those will be addressed once the
corresponding convention is introduced in LLVM.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D95561
Move content of the "public" header into the implementation file.
This also renames two enumerations that were previously used through
`rust_demangle::` scope, to avoid breaking a build bot with older
version of GCC that rejects uses of enumerator through `E::A` if there
is a variable with the same name as enumeration `E` in the scope.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D104362
Allow mangled names to include an arbitrary dot suffix, akin to vendor
specific suffix in Itanium mangling.
Primary motivation is a support for symbols renamed during ThinLTO
import / promotion (ThinLTO is the default configuration for optimized
builds in rustc).
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D104358
The llvm::demangle is currently used by llvm-objdump and llvm-readobj,
so this effectively adds support for Rust v0 mangling to those
applications.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D104340