//===-- YAMLSerialization.cpp ------------------------------------*- C++-*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// A YAML index file is a sequence of tagged entries.
// Each entry encodes a Symbol, the list of references to a symbol
// (a "ref bundle"), or a Relation between two symbols.
//
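// As a rough illustration only (all values below are made up, and optional
// keys are omitted), a file holding one symbol and one ref bundle could look
// like this, with each entry written as its own tagged YAML document:
//
//   --- !Symbol
//   ID: 0123456789ABCDEF
//   Name: vector
//   Scope: 'std::'
//   SymInfo:
//     Kind: Class
//     Lang: Cpp
//   --- !Refs
//   ID: 0123456789ABCDEF
//   References:
//     - Kind: 4
//       Location:
//         FileURI: 'file:///path/to/foo.h'
//         Start:
//           Line: 10
//           Column: 6
//         End:
//           Line: 10
//           Column: 12
//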
//===----------------------------------------------------------------------===//

#include "Index.h"
#include "Relation.h"
#include "Serialization.h"
#include "SymbolLocation.h"
#include "SymbolOrigin.h"
#include "Trace.h"
#include "dex/Dex.h"
#include "llvm/ADT/Optional.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Allocator.h"
#include "llvm/Support/Errc.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/StringSaver.h"
#include "llvm/Support/YAMLTraits.h"
#include "llvm/Support/raw_ostream.h"
#include <cstdint>

LLVM_YAML_IS_SEQUENCE_VECTOR(clang::clangd::Symbol::IncludeHeaderWithReferences)
LLVM_YAML_IS_SEQUENCE_VECTOR(clang::clangd::Ref)

namespace {
using RefBundle =
    std::pair<clang::clangd::SymbolID, std::vector<clang::clangd::Ref>>;
// This is a pale imitation of std::variant<Symbol, RefBundle, Relation>
struct VariantEntry {
  llvm::Optional<clang::clangd::Symbol> Symbol;
  llvm::Optional<RefBundle> Refs;
  llvm::Optional<clang::clangd::Relation> Relation;
};
// A helper that lets YAML serialize the 32-bit encoded position (Line&Column),
// since YAMLIO can't directly map bitfields.
struct YPosition {
  uint32_t Line;
  uint32_t Column;
};

} // namespace
namespace llvm {
namespace yaml {

using clang::clangd::Ref;
using clang::clangd::RefKind;
using clang::clangd::Relation;
using clang::clangd::RelationKind;
using clang::clangd::Symbol;
using clang::clangd::SymbolID;
using clang::clangd::SymbolLocation;
using clang::clangd::SymbolOrigin;
using clang::index::SymbolInfo;
using clang::index::SymbolKind;
using clang::index::SymbolLanguage;
using clang::index::SymbolRole;

// Helper to (de)serialize the SymbolID. We serialize it as a hex string.
struct NormalizedSymbolID {
  NormalizedSymbolID(IO &) {}
  NormalizedSymbolID(IO &, const SymbolID &ID) {
    llvm::raw_string_ostream OS(HexString);
    OS << ID;
  }

  SymbolID denormalize(IO &I) {
    auto ID = SymbolID::fromStr(HexString);
    if (!ID) {
      I.setError(llvm::toString(ID.takeError()));
      return SymbolID();
    }
    return *ID;
  }

  std::string HexString;
};

struct NormalizedSymbolFlag {
  NormalizedSymbolFlag(IO &) {}
  NormalizedSymbolFlag(IO &, Symbol::SymbolFlag F) {
    Flag = static_cast<uint8_t>(F);
  }

  Symbol::SymbolFlag denormalize(IO &) {
    return static_cast<Symbol::SymbolFlag>(Flag);
  }

  uint8_t Flag = 0;
};

struct NormalizedSymbolOrigin {
  NormalizedSymbolOrigin(IO &) {}
  NormalizedSymbolOrigin(IO &, SymbolOrigin O) {
    Origin = static_cast<uint8_t>(O);
  }

  SymbolOrigin denormalize(IO &) { return static_cast<SymbolOrigin>(Origin); }

  uint8_t Origin = 0;
};

template <> struct MappingTraits<YPosition> {
  static void mapping(IO &IO, YPosition &Value) {
    IO.mapRequired("Line", Value.Line);
    IO.mapRequired("Column", Value.Column);
  }
};

struct NormalizedPosition {
  using Position = clang::clangd::SymbolLocation::Position;
  NormalizedPosition(IO &) {}
  NormalizedPosition(IO &, const Position &Pos) {
    P.Line = Pos.line();
    P.Column = Pos.column();
  }

  Position denormalize(IO &) {
    Position Pos;
    Pos.setLine(P.Line);
    Pos.setColumn(P.Column);
    return Pos;
  }
  YPosition P;
};

struct NormalizedFileURI {
  NormalizedFileURI(IO &) {}
  NormalizedFileURI(IO &, const char *FileURI) { URI = FileURI; }

  // The returned pointer is owned by the UniqueStringSaver passed as the IO
  // context, so it outlives this normalization helper.
  const char *denormalize(IO &IO) {
    assert(IO.getContext() &&
           "Expecting a UniqueStringSaver to allocate data");
    return static_cast<llvm::UniqueStringSaver *>(IO.getContext())
        ->save(URI)
        .data();
  }

  std::string URI;
};

template <> struct MappingTraits<SymbolLocation> {
  static void mapping(IO &IO, SymbolLocation &Value) {
    MappingNormalization<NormalizedFileURI, const char *> NFile(IO,
                                                                Value.FileURI);
    IO.mapRequired("FileURI", NFile->URI);
    MappingNormalization<NormalizedPosition, SymbolLocation::Position> NStart(
        IO, Value.Start);
    IO.mapRequired("Start", NStart->P);
    MappingNormalization<NormalizedPosition, SymbolLocation::Position> NEnd(
        IO, Value.End);
    IO.mapRequired("End", NEnd->P);
  }
};

template <> struct MappingTraits<SymbolInfo> {
  static void mapping(IO &io, SymbolInfo &SymInfo) {
    // FIXME: expose other fields?
    io.mapRequired("Kind", SymInfo.Kind);
    io.mapRequired("Lang", SymInfo.Lang);
  }
};

template <>
struct MappingTraits<clang::clangd::Symbol::IncludeHeaderWithReferences> {
  static void mapping(IO &io,
                      clang::clangd::Symbol::IncludeHeaderWithReferences &Inc) {
    io.mapRequired("Header", Inc.IncludeHeader);
    io.mapRequired("References", Inc.References);
  }
};

template <> struct MappingTraits<Symbol> {
  static void mapping(IO &IO, Symbol &Sym) {
    MappingNormalization<NormalizedSymbolID, SymbolID> NSymbolID(IO, Sym.ID);
    MappingNormalization<NormalizedSymbolFlag, Symbol::SymbolFlag> NSymbolFlag(
        IO, Sym.Flags);
    MappingNormalization<NormalizedSymbolOrigin, SymbolOrigin> NSymbolOrigin(
        IO, Sym.Origin);
    IO.mapRequired("ID", NSymbolID->HexString);
    IO.mapRequired("Name", Sym.Name);
    IO.mapRequired("Scope", Sym.Scope);
    IO.mapRequired("SymInfo", Sym.SymInfo);
    IO.mapOptional("CanonicalDeclaration", Sym.CanonicalDeclaration,
                   SymbolLocation());
    IO.mapOptional("Definition", Sym.Definition, SymbolLocation());
    IO.mapOptional("References", Sym.References, 0u);
    IO.mapOptional("Origin", NSymbolOrigin->Origin);
    IO.mapOptional("Flags", NSymbolFlag->Flag);
    IO.mapOptional("Signature", Sym.Signature);
    IO.mapOptional("TemplateSpecializationArgs",
                   Sym.TemplateSpecializationArgs);
    IO.mapOptional("CompletionSnippetSuffix", Sym.CompletionSnippetSuffix);
    IO.mapOptional("Documentation", Sym.Documentation);
    IO.mapOptional("ReturnType", Sym.ReturnType);
    IO.mapOptional("Type", Sym.Type);
    IO.mapOptional("IncludeHeaders", Sym.IncludeHeaders);
  }
};

template <> struct ScalarEnumerationTraits<SymbolLanguage> {
  static void enumeration(IO &IO, SymbolLanguage &Value) {
    IO.enumCase(Value, "C", SymbolLanguage::C);
    IO.enumCase(Value, "Cpp", SymbolLanguage::CXX);
    IO.enumCase(Value, "ObjC", SymbolLanguage::ObjC);
    IO.enumCase(Value, "Swift", SymbolLanguage::Swift);
  }
};

template <> struct ScalarEnumerationTraits<SymbolKind> {
  static void enumeration(IO &IO, SymbolKind &Value) {
#define DEFINE_ENUM(name) IO.enumCase(Value, #name, SymbolKind::name)

    DEFINE_ENUM(Unknown);
    DEFINE_ENUM(Function);
    DEFINE_ENUM(Module);
    DEFINE_ENUM(Namespace);
    DEFINE_ENUM(NamespaceAlias);
    DEFINE_ENUM(Macro);
    DEFINE_ENUM(Enum);
    DEFINE_ENUM(Struct);
    DEFINE_ENUM(Class);
    DEFINE_ENUM(Protocol);
    DEFINE_ENUM(Extension);
    DEFINE_ENUM(Union);
    DEFINE_ENUM(TypeAlias);
    DEFINE_ENUM(Variable);
    DEFINE_ENUM(Field);
    DEFINE_ENUM(EnumConstant);
    DEFINE_ENUM(InstanceMethod);
    DEFINE_ENUM(ClassMethod);
    DEFINE_ENUM(StaticMethod);
    DEFINE_ENUM(InstanceProperty);
    DEFINE_ENUM(ClassProperty);
    DEFINE_ENUM(StaticProperty);
    DEFINE_ENUM(Constructor);
    DEFINE_ENUM(Destructor);
    DEFINE_ENUM(ConversionFunction);
    DEFINE_ENUM(Parameter);
    DEFINE_ENUM(Using);

#undef DEFINE_ENUM
  }
};

template <> struct MappingTraits<RefBundle> {
  static void mapping(IO &IO, RefBundle &Refs) {
    MappingNormalization<NormalizedSymbolID, SymbolID> NSymbolID(IO,
                                                                 Refs.first);
    IO.mapRequired("ID", NSymbolID->HexString);
    IO.mapRequired("References", Refs.second);
  }
};

struct NormalizedRefKind {
  NormalizedRefKind(IO &) {}
  NormalizedRefKind(IO &, RefKind O) { Kind = static_cast<uint8_t>(O); }

  RefKind denormalize(IO &) { return static_cast<RefKind>(Kind); }

  uint8_t Kind = 0;
};

template <> struct MappingTraits<Ref> {
  static void mapping(IO &IO, Ref &R) {
    MappingNormalization<NormalizedRefKind, RefKind> NKind(IO, R.Kind);
    IO.mapRequired("Kind", NKind->Kind);
    IO.mapRequired("Location", R.Location);
  }
};

struct NormalizedSymbolRole {
  NormalizedSymbolRole(IO &) {}
  NormalizedSymbolRole(IO &IO, SymbolRole R) {
    Kind = static_cast<uint8_t>(clang::clangd::symbolRoleToRelationKind(R));
  }

  SymbolRole denormalize(IO &IO) {
    return clang::clangd::relationKindToSymbolRole(
        static_cast<RelationKind>(Kind));
  }

  uint8_t Kind = 0;
};

template <> struct MappingTraits<SymbolID> {
  static void mapping(IO &IO, SymbolID &ID) {
    MappingNormalization<NormalizedSymbolID, SymbolID> NSymbolID(IO, ID);
    IO.mapRequired("ID", NSymbolID->HexString);
  }
};

template <> struct MappingTraits<Relation> {
  static void mapping(IO &IO, Relation &Relation) {
    MappingNormalization<NormalizedSymbolRole, SymbolRole> NRole(
        IO, Relation.Predicate);
    IO.mapRequired("Subject", Relation.Subject);
    IO.mapRequired("Predicate", NRole->Kind);
    IO.mapRequired("Object", Relation.Object);
  }
};

template <> struct MappingTraits<VariantEntry> {
  static void mapping(IO &IO, VariantEntry &Variant) {
    if (IO.mapTag("!Symbol", Variant.Symbol.hasValue())) {
      if (!IO.outputting())
        Variant.Symbol.emplace();
      MappingTraits<Symbol>::mapping(IO, *Variant.Symbol);
    } else if (IO.mapTag("!Refs", Variant.Refs.hasValue())) {
      if (!IO.outputting())
        Variant.Refs.emplace();
      MappingTraits<RefBundle>::mapping(IO, *Variant.Refs);
    } else if (IO.mapTag("!Relations", Variant.Relation.hasValue())) {
      if (!IO.outputting())
        Variant.Relation.emplace();
      MappingTraits<Relation>::mapping(IO, *Variant.Relation);
    }
  }
};

} // namespace yaml
} // namespace llvm

namespace clang {
namespace clangd {

void writeYAML(const IndexFileOut &O, llvm::raw_ostream &OS) {
  llvm::yaml::Output Yout(OS);
  for (const auto &Sym : *O.Symbols) {
    VariantEntry Entry;
    Entry.Symbol = Sym;
    Yout << Entry;
  }
  if (O.Refs)
    for (auto &Sym : *O.Refs) {
      VariantEntry Entry;
      Entry.Refs = Sym;
      Yout << Entry;
    }
  if (O.Relations)
    for (auto &R : *O.Relations) {
      VariantEntry Entry;
      Entry.Relation = R;
      Yout << Entry;
    }
}

llvm::Expected<IndexFileIn> readYAML(llvm::StringRef Data) {
  SymbolSlab::Builder Symbols;
  RefSlab::Builder Refs;
  RelationSlab::Builder Relations;
  // Owns the string data behind SymbolLocation::FileURI while parsing.
  llvm::BumpPtrAllocator Arena;
  llvm::UniqueStringSaver Strings(Arena);
  llvm::yaml::Input Yin(Data, &Strings);
  // Each tagged entry is a separate YAML document.
  while (Yin.setCurrentDocument()) {
    llvm::yaml::EmptyContext Ctx;
    VariantEntry Variant;
    yamlize(Yin, Variant, true, Ctx);
    if (Yin.error())
      return llvm::errorCodeToError(Yin.error());

    if (Variant.Symbol)
      Symbols.insert(*Variant.Symbol);
    if (Variant.Refs)
      for (const auto &Ref : Variant.Refs->second)
        Refs.insert(Variant.Refs->first, Ref);
    if (Variant.Relation)
      Relations.insert(*Variant.Relation);
    Yin.nextDocument();
  }

  IndexFileIn Result;
  Result.Symbols.emplace(std::move(Symbols).build());
  Result.Refs.emplace(std::move(Refs).build());
  Result.Relations.emplace(std::move(Relations).build());
  return std::move(Result);
}

std::string toYAML(const Symbol &S) {
  std::string Buf;
  {
    llvm::raw_string_ostream OS(Buf);
    llvm::yaml::Output Yout(OS);
    Symbol Sym = S; // copy: Yout<< requires mutability.
    Yout << Sym;
  }
  return Buf;
}

std::string toYAML(const std::pair<SymbolID, llvm::ArrayRef<Ref>> &Data) {
  RefBundle Refs = {Data.first, Data.second};
  std::string Buf;
  {
    llvm::raw_string_ostream OS(Buf);
    llvm::yaml::Output Yout(OS);
    Yout << Refs;
  }
  return Buf;
}

std::string toYAML(const Relation &R) {
  std::string Buf;
  {
    llvm::raw_string_ostream OS(Buf);
    llvm::yaml::Output Yout(OS);
    Relation Rel = R; // copy: Yout<< requires mutability.
    Yout << Rel;
  }
  return Buf;
}

} // namespace clangd
} // namespace clang