llvm-project/llvm/lib/Demangle/StringView.h

122 lines
3.2 KiB
C
Raw Normal View History

//===--- StringView.h -------------------------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//
// This file contains a limited version of LLVM's StringView class. It is
// copied here so that LLVMDemangle need not take a dependency on LLVMSupport.
//===----------------------------------------------------------------------===//
#ifndef LLVM_DEMANGLE_STRINGVIEW_H
#define LLVM_DEMANGLE_STRINGVIEW_H
#include <algorithm>
#include <cassert>
2018-07-18 03:48:46 +08:00
#include <cstring>
class StringView {
const char *First;
const char *Last;
public:
[MS Demangler] Demangle symbols in function scopes. There are a couple of issues you run into when you start getting into more complex names, especially with regards to function local statics. When you've got something like: int x() { static int n = 0; return n; } Then this needs to demangle to something like int `int __cdecl x()'::`1'::n The nested mangled symbols (e.g. `int __cdecl x()` in the above example) also share state with regards to back-referencing, so we need to be able to re-use the demangler in the middle of demangling a symbol while sharing back-ref state. To make matters more complicated, there are a lot of ambiguities when demangling a symbol's qualified name, because a function local scope pattern (usually something like `?1??name?`) looks suspiciously like many other possible things that can occur, such as `?1` meaning the second back-ref and disambiguating these cases is rather interesting. The `?1?` in a local scope pattern is actually a special case of the more general pattern of `? + <encoded number> + ?`, where "encoded number" can itself have embedded `@` symbols, which is a common delimeter in mangled names. So we have to take care during the disambiguation, which is the reason for the overly complicated `isLocalScopePattern` function in this patch. I've added some pretty obnoxious tests to exercise all of this, which exposed several other problems related to back-referencing, so those are fixed here as well. Finally, I've uncommented some tests that were previously marked as `FIXME`, since now these work. Differential Revision: https://reviews.llvm.org/D49965 llvm-svn: 338226
2018-07-30 11:12:34 +08:00
static const size_t npos = ~size_t(0);
template <size_t N>
StringView(const char (&Str)[N]) : First(Str), Last(Str + N - 1) {}
StringView(const char *First_, const char *Last_)
: First(First_), Last(Last_) {}
StringView(const char *First_, size_t Len)
: First(First_), Last(First_ + Len) {}
2018-07-18 03:48:46 +08:00
StringView(const char *Str) : First(Str), Last(Str + std::strlen(Str)) {}
StringView() : First(nullptr), Last(nullptr) {}
StringView substr(size_t From) const {
return StringView(begin() + From, size() - From);
}
[MS Demangler] Demangle symbols in function scopes. There are a couple of issues you run into when you start getting into more complex names, especially with regards to function local statics. When you've got something like: int x() { static int n = 0; return n; } Then this needs to demangle to something like int `int __cdecl x()'::`1'::n The nested mangled symbols (e.g. `int __cdecl x()` in the above example) also share state with regards to back-referencing, so we need to be able to re-use the demangler in the middle of demangling a symbol while sharing back-ref state. To make matters more complicated, there are a lot of ambiguities when demangling a symbol's qualified name, because a function local scope pattern (usually something like `?1??name?`) looks suspiciously like many other possible things that can occur, such as `?1` meaning the second back-ref and disambiguating these cases is rather interesting. The `?1?` in a local scope pattern is actually a special case of the more general pattern of `? + <encoded number> + ?`, where "encoded number" can itself have embedded `@` symbols, which is a common delimeter in mangled names. So we have to take care during the disambiguation, which is the reason for the overly complicated `isLocalScopePattern` function in this patch. I've added some pretty obnoxious tests to exercise all of this, which exposed several other problems related to back-referencing, so those are fixed here as well. Finally, I've uncommented some tests that were previously marked as `FIXME`, since now these work. Differential Revision: https://reviews.llvm.org/D49965 llvm-svn: 338226
2018-07-30 11:12:34 +08:00
size_t find(char C, size_t From = 0) const {
size_t FindBegin = std::min(From, size());
// Avoid calling memchr with nullptr.
if (FindBegin < size()) {
// Just forward to memchr, which is faster than a hand-rolled loop.
if (const void *P = ::memchr(First + FindBegin, C, size() - FindBegin))
return static_cast<const char *>(P) - First;
}
return npos;
}
StringView substr(size_t From, size_t To) const {
if (To >= size())
To = size() - 1;
if (From >= size())
From = size() - 1;
return StringView(First + From, First + To);
}
StringView dropFront(size_t N = 1) const {
if (N >= size())
N = size();
return StringView(First + N, Last);
}
[MS Demangler] Demangle symbols in function scopes. There are a couple of issues you run into when you start getting into more complex names, especially with regards to function local statics. When you've got something like: int x() { static int n = 0; return n; } Then this needs to demangle to something like int `int __cdecl x()'::`1'::n The nested mangled symbols (e.g. `int __cdecl x()` in the above example) also share state with regards to back-referencing, so we need to be able to re-use the demangler in the middle of demangling a symbol while sharing back-ref state. To make matters more complicated, there are a lot of ambiguities when demangling a symbol's qualified name, because a function local scope pattern (usually something like `?1??name?`) looks suspiciously like many other possible things that can occur, such as `?1` meaning the second back-ref and disambiguating these cases is rather interesting. The `?1?` in a local scope pattern is actually a special case of the more general pattern of `? + <encoded number> + ?`, where "encoded number" can itself have embedded `@` symbols, which is a common delimeter in mangled names. So we have to take care during the disambiguation, which is the reason for the overly complicated `isLocalScopePattern` function in this patch. I've added some pretty obnoxious tests to exercise all of this, which exposed several other problems related to back-referencing, so those are fixed here as well. Finally, I've uncommented some tests that were previously marked as `FIXME`, since now these work. Differential Revision: https://reviews.llvm.org/D49965 llvm-svn: 338226
2018-07-30 11:12:34 +08:00
StringView dropBack(size_t N = 1) const {
if (N >= size())
N = size();
return StringView(First, Last - N);
}
char front() const {
assert(!empty());
return *begin();
}
[MS Demangler] Demangle symbols in function scopes. There are a couple of issues you run into when you start getting into more complex names, especially with regards to function local statics. When you've got something like: int x() { static int n = 0; return n; } Then this needs to demangle to something like int `int __cdecl x()'::`1'::n The nested mangled symbols (e.g. `int __cdecl x()` in the above example) also share state with regards to back-referencing, so we need to be able to re-use the demangler in the middle of demangling a symbol while sharing back-ref state. To make matters more complicated, there are a lot of ambiguities when demangling a symbol's qualified name, because a function local scope pattern (usually something like `?1??name?`) looks suspiciously like many other possible things that can occur, such as `?1` meaning the second back-ref and disambiguating these cases is rather interesting. The `?1?` in a local scope pattern is actually a special case of the more general pattern of `? + <encoded number> + ?`, where "encoded number" can itself have embedded `@` symbols, which is a common delimeter in mangled names. So we have to take care during the disambiguation, which is the reason for the overly complicated `isLocalScopePattern` function in this patch. I've added some pretty obnoxious tests to exercise all of this, which exposed several other problems related to back-referencing, so those are fixed here as well. Finally, I've uncommented some tests that were previously marked as `FIXME`, since now these work. Differential Revision: https://reviews.llvm.org/D49965 llvm-svn: 338226
2018-07-30 11:12:34 +08:00
char back() const {
assert(!empty());
return *(end() - 1);
}
char popFront() {
assert(!empty());
return *First++;
}
bool consumeFront(char C) {
if (!startsWith(C))
return false;
*this = dropFront(1);
return true;
}
bool consumeFront(StringView S) {
if (!startsWith(S))
return false;
*this = dropFront(S.size());
return true;
}
bool startsWith(char C) const { return !empty() && *begin() == C; }
bool startsWith(StringView Str) const {
if (Str.size() > size())
return false;
return std::equal(Str.begin(), Str.end(), begin());
}
const char &operator[](size_t Idx) const { return *(begin() + Idx); }
const char *begin() const { return First; }
const char *end() const { return Last; }
size_t size() const { return static_cast<size_t>(Last - First); }
bool empty() const { return First == Last; }
};
inline bool operator==(const StringView &LHS, const StringView &RHS) {
return LHS.size() == RHS.size() &&
std::equal(LHS.begin(), LHS.end(), RHS.begin());
}
#endif