Summary:
Previously, we randomly pick one main file symbol in dynamic index, we
may loose the ideal symbol (with definition location) in the index.
It fixes the issue where sometimes we fail to go to the symbol definition, see:
1. call go-to-decl on Foo in Foo.cpp
2. jump to Foo.h, call go-to-def on Foo in Foo.h
we can't go back to Foo.cpp -- because we open Foo.cpp, Foo.h in clangd, both
files have Foo symbol (one with def&decl, one with decl only), we randomely
choose one.
Reviewers: kadircet
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D63425
llvm-svn: 363568
Summary:
This builds on the relations support added in D59407, D62459,
and D62471 to expose relations in SymbolIndex and its
implementations.
Reviewers: kadircet
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D62839
llvm-svn: 363481
Summary:
After rL360344, BackgroundIndex expects symbols with zero refcounts.
Therefore existing index files are no longer valid.
Assertion regarding finding target of a reference was wrong, since
background-index might still be going on or we might've loaded only some part of
the shards and might be missing the declaring shards for the symbol.
Reviewers: ilya-biryukov
Subscribers: MaskRay, jkorous, arphaman, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61734
llvm-svn: 360349
Summary:
For counting number of references clangd was relying on merging every
duplication of a symbol. Unfortunately this does not apply to FileIndex(and one
of its users' BackgroundIndex), since we get rid of duplication by simply
dropping symbols coming from non-canonical locations. So only one or two(coming
from canonical declaration header and defined source file, if exists)
replications of the same symbol reaches merging step.
This patch changes reference counting logic to rather count number of different
RefSlabs a given SymbolID exists.
Reviewers: ilya-biryukov
Subscribers: ioeric, MaskRay, jkorous, mgrang, arphaman, jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D59481
llvm-svn: 360344
Summary:
This is a further optimization of r350803, we drop docs in static index for
symbols not being indexed for completion, while keeping the docs in dynamic
index (we rely on dynamic index to get docs for class members).
Reviewers: ilya-biryukov
Reviewed By: ilya-biryukov
Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D56539
llvm-svn: 354792
Summary:
This enables include insertion by adding canonical includes into
preambledata.
Reviewers: ioeric, ilya-biryukov
Subscribers: javed.absar, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D57508
llvm-svn: 353054
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
The new guideline is to qualify with 'llvm::' explicitly both in
'.h' and '.cpp' files. This simplifies moving the code between
header and source files and is easier to keep consistent.
llvm-svn: 350531
Summary:
Instead of passing around a list of supported URI schemes in clangd, we
expose an interface to convert a path to URI using any compatible scheme
that has been registered. It favors customized schemes and falls
back to "file" when no other scheme works.
Changes in this patch are:
- URI::create(AbsPath, URISchemes) -> URI::create(AbsPath). The new API finds a
compatible scheme from the registry.
- Remove URISchemes option everywhere (ClangdServer, SymbolCollecter, FileIndex etc).
- Unit tests will use "unittest" by default.
- Move "test" scheme from ClangdLSPServer to ClangdMain.cpp, and only
register the test scheme when lit-test or enable-lit-scheme is set.
(The new flag is added to make lit protocol.test work; I wonder if there
is alternative here.)
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D54800
llvm-svn: 347467
Summary:
This allows us to deduplicate header symbols across TUs. File digests
are collects when collecting symbols/refs. And the index store deduplicates
file symbols based on the file digest.
Reviewers: sammccall, hokein
Reviewed By: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D53433
llvm-svn: 346221
Standardize on the most common namespace setup in our *.cpp files:
using namespace llvm;
namespace clang {
namespace clangd {
void foo(StringRef) { ... }
And remove redundant llvm:: qualifiers. (Except for cases like
make_unique where this causes problems with std:: and ADL).
This choice is pretty arbitrary, but some broad consistency is nice.
This is going to conflict with everything. Sorry :-/
Squash the other configurations:
A)
using namespace llvm;
using namespace clang;
using namespace clangd;
void clangd::foo(StringRef);
This is in some of the older files. (It prevents accidentally defining a
new function instead of one in the header file, for what that's worth).
B)
namespace clang {
namespace clangd {
void foo(llvm::StringRef) { ... }
This is fine, but in practice the using directive often gets added over time.
C)
namespace clang {
namespace clangd {
using namespace llvm; // inside the namespace
This was pretty common, but is a bit misleading: name lookup preferrs
clang::clangd::foo > clang::foo > llvm:: foo (no matter where the using
directive is).
llvm-svn: 344850
Summary:
The RefSlab::size can easily cause confusions, it returns the number of
different symbols, rahter than the number of all references.
- add numRefs() method and cache it, since calculating it everytime is nontrivial.
- clear misused places.
Reviewers: sammccall
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D53389
llvm-svn: 344745
Summary:
Reuse the old -use-dex-index experiment flag for this.
To avoid breaking the tests, make Dex deduplicate symbols, addressing an old FIXME.
Reviewers: hokein
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D53288
llvm-svn: 344594
Summary:
This allows inheriting from it, so index() can ga away and allowing
TestTU::index) to be fixed.
Reviewers: ioeric
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D52250
llvm-svn: 343780
Summary:
Pros:
o Loading macros from preamble for every completion is slow (see profile).
o Calculating macro USR is also slow (see profile).
o Sema can provide a lot of macro completion results (e.g. when filter is empty,
60k for some large TUs!).
Cons:
o Slight memory increase in dynamic index (~1%).
o Some extra work during preamble build (should be fine as preamble build and
indexAST is way slower).
Before:
{F7195645}
After:
{F7195646}
Reviewers: ilya-biryukov, sammccall
Reviewed By: sammccall
Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D52078
llvm-svn: 342529
Summary:
FileIndex now provides explicit interfaces for preamble and main file updates.
This avoids growing parameter list when preamble and main symbols diverge
further (e.g. D52078). This also gets rid of the hack in `indexAST` that
inferred main file index based on `TopLevelDecls`.
Also separate `indexMainDecls` from `indexAST`.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D52222
llvm-svn: 342460
Currently, `SymbolIndex::estimateMemoryUsage()` returns the "overhead"
estimate, i.e. the estimate of the Index data structure excluding
backing data (such as Symbol Slab and Reference Slab). This patch
propagates information about paired data size where necessary.
Reviewed By: ioeric, sammccall
Differential Revision: https://reviews.llvm.org/D51539
llvm-svn: 341800
Summary:
A few things that I noticed while merging the SwapIndex patch:
- SymbolOccurrences and particularly SymbolOccurrenceSlab are unwieldy names,
and these names appear *a lot*. Ref, RefSlab, etc seem clear enough
and read/format much better.
- The asymmetry between SymbolSlab and RefSlab (build() vs freeze()) is
confusing and irritating, and doesn't even save much code.
Avoiding RefSlab::Builder was my idea, but it was a bad one; add it.
- DenseMap<SymbolID, ArrayRef<Ref>> seems like a reasonable compromise for
constructing MemIndex - and means many less wasted allocations than the
current DenseMap<SymbolID, vector<Ref*>> for FileIndex, and none for
slabs.
- RefSlab::find() is not actually used for anything, so we can throw
away the DenseMap and keep the representation much more compact.
- A few naming/consistency fixes: e.g. Slabs,Refs -> Symbols,Refs.
Reviewers: ioeric
Subscribers: ilya-biryukov, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D51605
llvm-svn: 341368
Summary:
This is now handled by a wrapper class SwapIndex, so MemIndex/DexIndex can be
immutable and focus on their job.
Old and busted:
I have a MemIndex, which holds a shared_ptr<vector<Symbol*>>, which keeps the
symbol slab alive. I update by calling build(shared_ptr<vector<Symbol*>>).
New hotness: I have a SwapIndex, which holds a unique_ptr<SymbolIndex>, which
holds a MemIndex, which holds a shared_ptr<void>, which keeps backing
data alive.
I update by building a new MemIndex and calling SwapIndex::reset().
Reviewers: kbobyrev, ioeric
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D51422
llvm-svn: 341318
This patch prints information about built index size estimation to
verbose logs. This is useful for optimizing memory usage of DexIndex and
comparisons with MemIndex.
Reviewed by: sammccall
Differential Revision: https://reviews.llvm.org/D51154
llvm-svn: 340601
Summary:
It was previously only indexing the preamble decls. The new
implementation will index both the preamble and the main AST and
report both sets of symbols, preferring the ones from the main AST
whenever the symbol is present in both.
The symbols in the main AST slab always store all information
available in the preamble symbols, possibly adding more,
e.g. definition locations.
Reviewers: hokein, ioeric
Reviewed By: ioeric
Subscribers: kadircet, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50889
llvm-svn: 340404
Summary:
This is the first step of implementing Xrefs in clangd:
- add index interfaces, and related data structures.
Reviewers: sammccall
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D49658
llvm-svn: 339011
Summary: Surface it in the completion items C++ API, and when a flag is set.
Reviewers: ioeric
Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D48938
llvm-svn: 336309
Summary:
This allows dynamic index to have consistent URI schemes with the
static index which can have customized URI schemes, which would make file
proximity scoring based on URIs easier.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D47931
llvm-svn: 334809
Summary:
This is more efficient and avoids data races when reading files that
come from the preamble. The staleness can occur when reading a file
from disk that changed after the preamble was built. This can lead to
crashes, e.g. when parsing comments.
We do not to rely on symbols from the main file anyway, since any info
that those provide can always be taken from the AST.
Reviewers: ioeric, sammccall
Reviewed By: ioeric
Subscribers: malaperle, klimek, javed.absar, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D47272
llvm-svn: 333196
Summary:
This patch adds index support for GoToDefinition -- when we don't get the
definition from local AST, we query our index (Static&Dynamic) index to
get it.
Since we currently collect top-level symbol in the index, it doesn't support all
cases (e.g. class members), we will extend the index to include more symbols in
the future.
Reviewers: sammccall
Subscribers: klimek, ilya-biryukov, jkorous-apple, ioeric, MaskRay, cfe-commits
Differential Revision: https://reviews.llvm.org/D45717
llvm-svn: 331189
Summary:
Potential use case: argument go-to-definition result with symbol
information (e.g. function definition in cc file) that might not be in the AST.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D44305
llvm-svn: 327487
Summary:
This is an important ranking signal.
It's off for the dynamic index for now. Correspondingly, tell the index
infrastructure only to report declarations for the dynamic index.
Reviewers: ioeric, hokein
Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D44315
llvm-svn: 327275
Summary:
Currently, we pick the first declaration of a symbol in a TU, which is considered
canonical in the clangIndex, as the canonical declaration in clangd. This causes
forward declarations that might appear in a random header to be used as a
canonical declaration, which is not desirable for features like go-to-declaration
or include insertion.
For example, for class X, we would consider the forward declaration in fwd.h to
be the canonical declaration, while the preferred canonical declaration should
be the actual definition in x.h.
```
// fwd.h
class X; // forward decl
// x.h
class X {};
```
This patch fixes the issue by making symbol collector favor the actual definition of
a TagDecl (i.e. class/struct/enum/union) found in a header file over the first seen
declarations in a TU. Other symbol types like functions are not handled because
using the first seen declarations as canonical declarations is usually a good
heuristic for them.
Reviewers: sammccall
Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D43823
llvm-svn: 326313
Summary:
The new behaviors introduced by this patch:
o When include collection is enabled, we always set IncludeHeader field in Symbol
even if it's the same as FileURI in decl.
o Disable include collection in FileIndex which is currently only used to build
dynamic index. We should revisit when we actually want to use FileIndex to global
index.
o Code-completion only uses IncludeHeader to insert headers but not FileURI in
CanonicalDeclaration. This ensures that inserted headers are always canonicalized.
Note that include insertion can still be triggered for symbols that are already
included if they are merged from dynamic index and static index, but we would
only use includes that are already canonicalized (e.g. from static index).
Reason for change:
Collecting header includes in dynamic index enables inserting includes for headers
that are not indexed but opened in the editor. Comparing to inserting includes for
symbols in global/static index, this is nice-to-have but would probably require
non-trivial amount of work to get right. For example:
o Currently it's not easy to fully support CanonicalIncludes in dynamic index, given the way
we run dynamic index.
o It's also harder to reason about the correctness of include canonicalization for dynamic index
(i.e. symbols in the current file/TU) than static index where symbols are collected
offline and sanity check is possible before shipping to production.
o We have less control/flexibility over symbol info in the dynamic index
(e.g. URIs, path normalization), which could be used to help make decision when inserting includes.
As header collection (especially canonicalization) is relatively new, and enabling
it for dynamic index would immediately affect current users with only dynamic
index support, I propose we disable it for dynamic index for now to avoid
compromising other hot features like code completion and only support it for
static index where include insertion would likely to bring more value.
Reviewers: ilya-biryukov, sammccall, hokein
Subscribers: klimek, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D43550
llvm-svn: 325764
Summary:
o Avoid inserting a header include into the header itself.
o Avoid inserting non-header files (by not indexing symbols in main
files at all).
o Canonicalize include paths for symbols in dynamic index.
Reviewers: sammccall, ilya-biryukov
Reviewed By: ilya-biryukov
Subscribers: klimek, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D43462
llvm-svn: 325523
Summary:
Instead of passing Context explicitly around, we now have a thread-local
Context object `Context::current()` which is an implicit argument to
every function.
Most manipulation of this should use the WithContextValue helper, which
augments the current Context to add a single KV pair, and restores the
old context on destruction.
Advantages are:
- less boilerplate in functions that just propagate contexts
- reading most code doesn't require understanding context at all, and
using context as values in fewer places still
- fewer options to pass the "wrong" context when it changes within a
scope (e.g. when using Span)
- contexts pass through interfaces we can't modify, such as VFS
- propagating contexts across threads was slightly tricky (e.g.
copy vs move, no move-init in lambdas), and is now encapsulated in
the threadpool
Disadvantages are all the usual TLS stuff - hidden magic, and
potential for higher memory usage on threads that don't use the
context. (In practice, it's just one pointer)
Reviewers: ilya-biryukov
Subscribers: klimek, jkorous-apple, ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D42517
llvm-svn: 323872
Summary:
To stay fast, it avoids deserializing anything outside the current file, by
disabling the LoadExternal code completion option added in r322377, when the
index is enabled.
Reviewers: hokein
Subscribers: klimek, ilya-biryukov, cfe-commits
Differential Revision: https://reviews.llvm.org/D41996
llvm-svn: 322387
Summary:
o We only collect symbols in namespace or translation unit scopes.
o Add an option to only collect symbols in included headers.
Reviewers: hokein, ilya-biryukov
Reviewed By: hokein, ilya-biryukov
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D41823
llvm-svn: 322193
Summary:
This improves a few things:
- the insert -> freeze -> read sequence is now enforced/communicated by the
type system
- SymbolSlab::const_iterator iterates over symbols, not over id-symbol pairs
- we avoid permanently storing a second copy of the IDs, and the
string map's hashtable
The slab size is now down to 21.8MB for the LLVM project.
Of this only 2.7MB is strings, the rest is #symbols * `sizeof(Symbol)`.
`sizeof(Symbol)` is currently 96, which seems too big - I think
SymbolInfo isn't efficiently packed. That's a topic for another patch!
Also added simple API to see the memory usage/#symbols of a slab, since
it seems likely we will continue to care about this.
Reviewers: ilya-biryukov
Subscribers: klimek, mgrang, cfe-commits
Differential Revision: https://reviews.llvm.org/D41506
llvm-svn: 321412