[clangd] Introduce a "Symbol" class.
Summary:
* The "Symbol" class represents a C++ symbol in the codebase, containing all the
information of a C++ symbol needed by clangd. clangd will use it in clangd's
AST/dynamic index and global/static index (code completion and code
navigation).
* The SymbolCollector (another IndexAction) will be used to recollect the
symbols when the source file is changed (for ASTIndex), or to generate
all C++ symbols for the whole project.
In the long term (when index-while-building is ready), clangd should share the
same "Symbol" structure and IndexAction with index-while-building, but
for now we want to have some stuff working in clangd.
Reviewers: ioeric, sammccall, ilya-biryukov, malaperle
Reviewed By: sammccall
Subscribers: malaperle, klimek, mgorny, cfe-commits
Differential Revision: https://reviews.llvm.org/D40897
llvm-svn: 320486
2017-12-12 23:42:10 +08:00
//===--- Index.cpp -----------------------------------------------*- C++-*-===//
//
//                     The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//

#include "Index.h"
#include "llvm/ADT/StringExtras.h"
[clangd] Support multiple #include headers in one symbol.
Summary:
Currently, a symbol can have only one #include header attached, which
might not work well if the symbol can be imported via different #includes depending
on where it's used. This patch stores multiple #include headers (with # references)
for each symbol, so that CodeCompletion can decide which include to insert.
In this patch, code completion simply picks the most popular include as the default inserted header. We also return all possible includes and their edits in the `CodeCompletion` results.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: mgrang, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D51291
llvm-svn: 341304
2018-09-03 18:18:21 +08:00
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/SHA1.h"
#include "llvm/Support/raw_ostream.h"
namespace clang {
namespace clangd {
using namespace llvm;
raw_ostream &operator<<(raw_ostream &OS, const SymbolLocation &L) {
  if (!L)
    return OS << "(none)";
  return OS << L.FileURI << "[" << L.Start.Line << ":" << L.Start.Column << "-"
            << L.End.Line << ":" << L.End.Column << ")";
}

SymbolID::SymbolID(StringRef USR)
    : HashValue(SHA1::hash(arrayRefFromStringRef(USR))) {}

raw_ostream &operator<<(raw_ostream &OS, const SymbolID &ID) {
  OS << toHex(toStringRef(ID.HashValue));
  return OS;
}
std::string SymbolID::str() const {
  std::string ID;
  llvm::raw_string_ostream OS(ID);
  OS << *this;
  return OS.str();
}
void operator>>(StringRef Str, SymbolID &ID) {
  std::string HexString = fromHex(Str);
  assert(HexString.size() == ID.HashValue.size());
  std::copy(HexString.begin(), HexString.end(), ID.HashValue.begin());
}
raw_ostream &operator<<(raw_ostream &OS, SymbolOrigin O) {
  if (O == SymbolOrigin::Unknown)
    return OS << "unknown";
  constexpr static char Sigils[] = "ADSM4567";
  for (unsigned I = 0; I < sizeof(Sigils); ++I)
    if (static_cast<uint8_t>(O) & 1u << I)
      OS << Sigils[I];
  return OS;
}
raw_ostream &operator<<(raw_ostream &OS, const Symbol &S) {
  return OS << S.Scope << S.Name;
}
double quality(const Symbol &S) {
  // This avoids a sharp gradient for tail symbols, and also neatly avoids the
  // question of whether 0 references means a bad symbol or missing data.
  if (S.References < 3)
    return 1;
  return std::log(S.References);
}
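
The scoring rule in `quality()` can be tried in isolation. The following is a minimal sketch, assuming a hypothetical `FakeSymbol` struct that carries only a reference count; neither `FakeSymbol` nor `sketchQuality` is part of clangd, and the real `Symbol` has many more fields.

```cpp
#include <cmath>

// Hypothetical stand-in for Symbol: only the reference count matters here.
struct FakeSymbol {
  unsigned References = 0;
};

// Same shape as quality() above: symbols with fewer than 3 references get a
// flat score of 1, everything else gets the natural log of the count.
double sketchQuality(const FakeSymbol &S) {
  if (S.References < 3)
    return 1;
  return std::log(static_cast<double>(S.References));
}
```

Since ln(3) is only slightly above 1, the score barely moves at the threshold, which is one way to read the "sharp gradient" remark in the comment.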
SymbolSlab::const_iterator SymbolSlab::find(const SymbolID &ID) const {
  auto It = std::lower_bound(Symbols.begin(), Symbols.end(), ID,
                             [](const Symbol &S, const SymbolID &I) {
                               return S.ID < I;
                             });
  if (It != Symbols.end() && It->ID == ID)
    return It;
  return Symbols.end();
}
// Copy the underlying data of the symbol into the owned arena.
static void own(Symbol &S, llvm::UniqueStringSaver &Strings,
                BumpPtrAllocator &Arena) {
  // Intern replaces V with a reference to the same string owned by the arena.
  auto Intern = [&](StringRef &V) { V = Strings.save(V); };
  // We need to copy every StringRef field onto the arena.
  Intern(S.Name);
  Intern(S.Scope);
  Intern(S.CanonicalDeclaration.FileURI);
  Intern(S.Definition.FileURI);

  Intern(S.Signature);
  Intern(S.CompletionSnippetSuffix);
  Intern(S.Documentation);
  Intern(S.ReturnType);
  for (auto &I : S.IncludeHeaders)
    Intern(I.IncludeHeader);
}
void SymbolSlab::Builder::insert(const Symbol &S) {
  auto R = SymbolIndex.try_emplace(S.ID, Symbols.size());
  if (R.second) {
    Symbols.push_back(S);
    own(Symbols.back(), UniqueStrings, Arena);
  } else {
    auto &Copy = Symbols[R.first->second] = S;
    own(Copy, UniqueStrings, Arena);
  }
}
SymbolSlab SymbolSlab::Builder::build() && {
  Symbols = {Symbols.begin(), Symbols.end()}; // Force shrink-to-fit.
  // Sort symbols so the slab can binary search over them.
  std::sort(Symbols.begin(), Symbols.end(),
            [](const Symbol &L, const Symbol &R) { return L.ID < R.ID; });
  // We may have unused strings from overwritten symbols. Build a new arena.
  BumpPtrAllocator NewArena;
  llvm::UniqueStringSaver Strings(NewArena);
  for (auto &S : Symbols)
    own(S, Strings, NewArena);
  return SymbolSlab(std::move(NewArena), std::move(Symbols));
}
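
The insert-then-build flow above (overwrite on duplicate ID, sort once, look up by binary search) can be illustrated without LLVM types. This is only a sketch under stated assumptions: `MiniSymbol`, `MiniSlabBuilder`, and `findSymbol` are hypothetical names, an `int` stands in for the SHA1-based `SymbolID`, and `std::string` ownership replaces the arena and `UniqueStringSaver` used by the real code.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical simplified symbol; an int ID stands in for SymbolID.
struct MiniSymbol {
  int ID;
  std::string Name;
};

class MiniSlabBuilder {
public:
  // A later insert with the same ID overwrites the earlier symbol in place.
  void insert(const MiniSymbol &S) {
    auto R = Index.emplace(S.ID, Symbols.size());
    if (R.second)
      Symbols.push_back(S);
    else
      Symbols[R.first->second] = S;
  }

  // Sort by ID once so that lookups can use binary search.
  std::vector<MiniSymbol> build() && {
    std::sort(Symbols.begin(), Symbols.end(),
              [](const MiniSymbol &L, const MiniSymbol &R) {
                return L.ID < R.ID;
              });
    return std::move(Symbols);
  }

private:
  std::vector<MiniSymbol> Symbols;
  std::unordered_map<int, std::size_t> Index; // ID -> position in Symbols.
};

// Binary search over the sorted slab; returns nullptr if the ID is absent.
const MiniSymbol *findSymbol(const std::vector<MiniSymbol> &Slab, int ID) {
  auto It = std::lower_bound(
      Slab.begin(), Slab.end(), ID,
      [](const MiniSymbol &S, int I) { return S.ID < I; });
  if (It != Slab.end() && It->ID == ID)
    return &*It;
  return nullptr;
}
```

The sketch omits the step where `build()` re-interns strings into a fresh arena; with `std::string` members each symbol already owns its data, so there are no stale strings to drop.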
raw_ostream &operator<<(raw_ostream &OS, SymbolOccurrenceKind K) {
  if (K == SymbolOccurrenceKind::Unknown)
    return OS << "Unknown";
  static const std::vector<const char *> Messages = {"Decl", "Def", "Ref"};
  bool VisitedOnce = false;
  for (unsigned I = 0; I < Messages.size(); ++I) {
    if (static_cast<uint8_t>(K) & 1u << I) {
      if (VisitedOnce)
        OS << ", ";
      OS << Messages[I];
      VisitedOnce = true;
    }
  }
  return OS;
}
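
The bitmask-to-text loop above can be exercised on its own. The sketch below assumes a hypothetical `MiniKind` enum whose bit values (1, 2, 4) are chosen to match the order of the `Messages` table; the actual `SymbolOccurrenceKind` values live in `Index.h`, which is not shown here. It returns a `std::string` instead of writing to a `raw_ostream`.

```cpp
#include <cstdint>
#include <string>

// Hypothetical bitmask enum; the bit assignments are an assumption.
enum class MiniKind : std::uint8_t {
  Unknown = 0,
  Declaration = 1 << 0,
  Definition = 1 << 1,
  Reference = 1 << 2,
};

// Builds a comma-separated list with one entry per bit set in K.
std::string describeKind(MiniKind K) {
  if (K == MiniKind::Unknown)
    return "Unknown";
  static const char *const Messages[] = {"Decl", "Def", "Ref"};
  std::string Out;
  for (unsigned I = 0; I < 3; ++I) {
    if (static_cast<std::uint8_t>(K) & (1u << I)) {
      if (!Out.empty())
        Out += ", ";
      Out += Messages[I];
    }
  }
  return Out;
}
```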
llvm::raw_ostream &operator<<(llvm::raw_ostream &OS,
                              const SymbolOccurrence &Occurrence) {
  OS << Occurrence.Location << ":" << Occurrence.Kind;
  return OS;
}
void SymbolOccurrenceSlab::insert(const SymbolID &SymID,
                                  const SymbolOccurrence &Occurrence) {
  assert(!Frozen &&
         "Can't insert a symbol occurrence after the slab has been frozen!");
  auto &SymOccurrences = Occurrences[SymID];
  SymOccurrences.push_back(Occurrence);
  SymOccurrences.back().Location.FileURI =
      UniqueStrings.save(Occurrence.Location.FileURI);
}
void SymbolOccurrenceSlab::freeze() {
  // Deduplicate symbol occurrences.
  for (auto &IDAndOccurrence : Occurrences) {
    auto &Occurrence = IDAndOccurrence.getSecond();
    std::sort(Occurrence.begin(), Occurrence.end());
    Occurrence.erase(std::unique(Occurrence.begin(), Occurrence.end()),
                     Occurrence.end());
  }
  Frozen = true;
}
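
The deduplication step in `freeze()` is the standard sort-then-unique idiom. A minimal sketch, assuming a hypothetical `MiniOccurrence` that is ordered and compared by line and column only (the real `SymbolOccurrence` comparison operators are defined in `Index.h` and may compare more fields):

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical occurrence type with just a position.
struct MiniOccurrence {
  int Line;
  int Column;
};

bool operator<(const MiniOccurrence &L, const MiniOccurrence &R) {
  return std::tie(L.Line, L.Column) < std::tie(R.Line, R.Column);
}

bool operator==(const MiniOccurrence &L, const MiniOccurrence &R) {
  return std::tie(L.Line, L.Column) == std::tie(R.Line, R.Column);
}

// Sorting first makes equal elements adjacent, which std::unique requires
// in order to remove every duplicate rather than only consecutive ones.
void dedupe(std::vector<MiniOccurrence> &Occurrences) {
  std::sort(Occurrences.begin(), Occurrences.end());
  Occurrences.erase(std::unique(Occurrences.begin(), Occurrences.end()),
                    Occurrences.end());
}
```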
} // namespace clangd
} // namespace clang