Commit Graph

23 Commits

Author SHA1 Message Date
Sam McCall 9c7624e14b [clangd] Factor out the data-swapping functionality from MemIndex/DexIndex.
Summary:
This is now handled by a wrapper class SwapIndex, so MemIndex/DexIndex can be
immutable and focus on their job.

Old and busted:
 I have a MemIndex, which holds a shared_ptr<vector<Symbol*>>, which keeps the
 symbol slab alive. I update by calling build(shared_ptr<vector<Symbol*>>).

New hotness: I have a SwapIndex, which holds a unique_ptr<SymbolIndex>, which
 holds a MemIndex, which holds a shared_ptr<void>, which keeps backing
 data alive.
 I update by building a new MemIndex and calling SwapIndex::reset().

Reviewers: kbobyrev, ioeric

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51422

llvm-svn: 341318
2018-09-03 14:37:43 +00:00
Eric Liu 83f63e42b2 [clangd] Support multiple #include headers in one symbol.
Summary:
Currently, a symbol can have only one #include header attached, which
might not work well if the symbol can be imported via different #includes depending
on where it's used. This patch stores multiple #include headers (with # references)
for each symbol, so that CodeCompletion can decide which include to insert.

In this patch, code completion simply picks the most popular include as the default inserted header. We also return all possible includes and their edits in the `CodeCompletion` results.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: mgrang, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51291

llvm-svn: 341304
2018-09-03 10:18:21 +00:00
Haojian Wu e8064b6f6d [clangd] Implement findOccurrences interface in dynamic index.
Summary:
Implement the interface in
  - FileIndex
  - MemIndex
  - MergeIndex

Depends on https://reviews.llvm.org/D50385.

Reviewers: sammccall, ilya-biryukov

Reviewed By: sammccall

Subscribers: mgrang, ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51279

llvm-svn: 341242
2018-08-31 19:53:37 +00:00
Sam McCall 2e5700f038 [clangd] Flatten out Symbol::Details. It was ill-conceived, sorry.
Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51504

llvm-svn: 341211
2018-08-31 13:55:01 +00:00
Kirill Bobyrev 870aaf2963 [clangd] DexIndex implementation prototype
This patch is a proof-of-concept Dex index implementation. It has
several flaws, which don't allow replacing static MemIndex yet, such as:

* Not being able to handle queries of small size (less than 3 symbols);
  a way to solve this is generating trigrams of smaller size and having
  such incomplete trigrams in the index structure.
* Speed measurements: while manually editing files in Vim and requesting
  autocompletion gives an impression that the performance is at least
  comparable with the current static index, having actual numbers is
  important because we don't want to hurt the users and roll out slow
  code. Eric (@ioeric) suggested that we should only replace MemIndex as
  soon as we have the evidence that this is not a regression in terms of
  performance. An approach which is likely to be successful here is to
  wait until we have benchmark library in the LLVM core repository, which
  is something I have suggested in the LLVM mailing lists, received
  positive feedback on and started working on. I will add a dependency as
  soon as the suggested patch is out for a review (currently there's at
  least one complication which is being addressed by
  https://github.com/google/benchmark/pull/649). Key performance
  improvements for iterators are sorting by cost and the limit iterator.
* Quality measurements: currently, boosting iterator and two-phase
  lookup stage are not implemented, without these the quality is likely to
  be worse than the current implementation can yield. Measuring quality is
  tricky, but another suggestion in the offline discussion was that the
  drop-in replacement should only happen after Boosting iterators
  implementation (and subsequent query enhancement).

The proposed changes do not affect Clangd functionality or performance,
`DexIndex` is only used in unit tests and not in production code.

Reviewed by: ioeric

Differential Revision: https://reviews.llvm.org/D50337

llvm-svn: 340175
2018-08-20 14:39:32 +00:00
Sam McCall 2161ec7ee2 [clangd] Track origins of symbols (various indexes, Sema).
Summary: Surface it in the completion items C++ API, and when a flag is set.

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D48938

llvm-svn: 336309
2018-07-05 06:20:41 +00:00
Sam McCall a68951e37e [clangd] More precise representation of symbol names/labels in the index.
Summary:
Previously, the strings matched LSP completion pretty closely.
The completion label was a single string, for instance. This made
implementing completion itself easy but makes it hard to use the names
in other way, e.g. pretty-printed name in synthesized
documentation/hover.

It also limits our introspection into completion items, which can only
be as precise as the indexed symbols. This change is a prerequisite to
improvements to overload bundling which need to inspect e.g. signature
structure.

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D48475

llvm-svn: 335360
2018-06-22 16:11:35 +00:00
Eric Liu 9ec459ff6b [clangd] Add an interface that finds symbol by SymbolID in SymbolIndex.
Summary:
Potential use case: argument go-to-definition result with symbol
information (e.g. function definition in cc file) that might not be in the AST.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits

Differential Revision: https://reviews.llvm.org/D44305

llvm-svn: 327487
2018-03-14 09:48:05 +00:00
Sam McCall 93f99bf31f [clangd] Collect the number of files referencing a symbol in the static index.
Summary:
This is an important ranking signal.
It's off for the dynamic index for now. Correspondingly, tell the index
infrastructure only to report declarations for the dynamic index.

Reviewers: ioeric, hokein

Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits

Differential Revision: https://reviews.llvm.org/D44315

llvm-svn: 327275
2018-03-12 14:49:09 +00:00
Sam McCall 77a719cb9e [clangd] Fix unintentionally loose fuzzy matching, and the tests masking it.
Summary:
The intent was that [ar] doesn't match "FooBar"; the first character must match
a Head character (hard requirement, not just a low score).
This matches VSCode, and was "tested" but the tests were defective.

The tests expected matches("FooBar") to fail for lack of a match. But instead
it fails because the string should be annotated - matches("FooB[ar]").
This patch makes matches("FooBar") ignore annotations, as was intended.

Fixing the code to reject weak matches for the first char causes problems:
-  [bre] no longer matches "HTMLBRElement".
   We allow matching against an uppercase char even if we don't think it's head.
   Only do this if there's at least one lowercase, to avoid triggering on MACROS
-  [print] no longer matches "sprintf".
   This is hard to fix without false positives (e.g. [int] vs "sprintf"])
   This patch leaves this case broken. A future patch will add a dictionary
   providing custom segmentation to common names from the standard library.

Fixed a couple of index tests that indirectly relied on broken fuzzy matching.
Added const in a couple of missing places for consistency with new code.

Subscribers: klimek, ilya-biryukov, jkorous-apple, ioeric, cfe-commits

Differential Revision: https://reviews.llvm.org/D44003

llvm-svn: 326721
2018-03-05 17:34:33 +00:00
Sam McCall ab8e393f62 [clangd] Invert return value of fuzzyFind() (fix MemIndex's return value)
Have had way too many bugs by converting between "isComplete" and
"isIncomplete". LSP is immovable, so use isIncomplete everywhere.

llvm-svn: 325493
2018-02-19 13:04:41 +00:00
Sam McCall 6003951c66 [clangd] Collect definitions when indexing.
Within a TU:
 - as now, collect a declaration from the first occurrence of a symbol
   (taking clang's canonical declaration)
 - when we first see a definition occurrence, copy the symbol and add it
Across TUs/sources:
 - mergeSymbol in Merge.h is responsible for combining matching Symbols.
   This covers dynamic/static merges and cross-TU merges in the static index.
 - it prefers declarations from Symbols that have a definition.
 - GlobalSymbolBuilderMain is modified to use mergeSymbol as a reduce step.
Random cleanups (can be pulled out):
 - SymbolFromYAML -> SymbolsFromYAML, new singular SymbolFromYAML added
 - avoid uninit'd SymbolLocations. Add an idiomatic way to check "absent".
 - CanonicalDeclaration (as well as Definition) are mapped as optional in YAML.
 - added operator<< for Symbol & SymbolLocation, for debugging

Reviewers: ioeric, hokein

Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits

Differential Revision: https://reviews.llvm.org/D42942

llvm-svn: 324735
2018-02-09 14:42:01 +00:00
Eric Liu 7f24765912 [clangd] Use URIs in index symbols.
Reviewers: hokein, sammccall

Reviewed By: sammccall

Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits

Differential Revision: https://reviews.llvm.org/D42915

llvm-svn: 324358
2018-02-06 16:10:35 +00:00
Sam McCall d1a7a37c22 [clangd] Pass Context implicitly using TLS.
Summary:
Instead of passing Context explicitly around, we now have a thread-local
Context object `Context::current()` which is an implicit argument to
every function.
Most manipulation of this should use the WithContextValue helper, which
augments the current Context to add a single KV pair, and restores the
old context on destruction.

Advantages are:
- less boilerplate in functions that just propagate contexts
- reading most code doesn't require understanding context at all, and
  using context as values in fewer places still
- fewer options to pass the "wrong" context when it changes within a
  scope (e.g. when using Span)
- contexts pass through interfaces we can't modify, such as VFS
- propagating contexts across threads was slightly tricky (e.g.
  copy vs move, no move-init in lambdas), and is now encapsulated in
  the threadpool

Disadvantages are all the usual TLS stuff - hidden magic, and
potential for higher memory usage on threads that don't use the
context. (In practice, it's just one pointer)

Reviewers: ilya-biryukov

Subscribers: klimek, jkorous-apple, ioeric, cfe-commits

Differential Revision: https://reviews.llvm.org/D42517

llvm-svn: 323872
2018-01-31 13:40:48 +00:00
Sam McCall a5ce243fa3 [clangd] Use fuzzy match to select top N index results.
Summary:
This makes performance slower but more predictable (it always processes
every symbol). We need to find ways to make this fast, possibly by precomputing
short queries or capping the number of scored results. But our current approach
is too naive.

It also no longer returns results in a "good" order. In fact it's pathological:
the top N results are ranked from worst to best. Indexes aren't responsible for
ranking and MergedIndex can't do a good job, so I'm pleased that this will make
any hidden assumptions we have more noticeable :-)

Reviewers: hokein

Subscribers: klimek, ilya-biryukov, cfe-commits

Differential Revision: https://reviews.llvm.org/D42060

llvm-svn: 322821
2018-01-18 08:35:04 +00:00
Sam McCall 0faecf0c33 [clangd] Merge results from static/dynamic index.
Summary:
We now hide the static/dynamic split from the code completion, behind a
new implementation of the SymbolIndex interface. This will reduce the
complexity of the sema/index merging that needs to be done by
CodeComplete, at a fairly small cost in flexibility.

Reviewers: hokein

Subscribers: klimek, mgorny, ilya-biryukov, cfe-commits

Differential Revision: https://reviews.llvm.org/D42049

llvm-svn: 322480
2018-01-15 12:33:00 +00:00
Sam McCall 4b9bbb378b [clangd] Use Builder for symbol slabs, and use sorted-vector for storage
Summary:
This improves a few things:
 - the insert -> freeze -> read sequence is now enforced/communicated by the
   type system
 - SymbolSlab::const_iterator iterates over symbols, not over id-symbol pairs
 - we avoid permanently storing a second copy of the IDs, and the
   string map's hashtable

The slab size is now down to 21.8MB for the LLVM project.
Of this only 2.7MB is strings, the rest is #symbols * `sizeof(Symbol)`.
`sizeof(Symbol)` is currently 96, which seems too big - I think
SymbolInfo isn't efficiently packed. That's a topic for another patch!

Also added simple API to see the memory usage/#symbols of a slab, since
it seems likely we will continue to care about this.

Reviewers: ilya-biryukov

Subscribers: klimek, mgrang, cfe-commits

Differential Revision: https://reviews.llvm.org/D41506

llvm-svn: 321412
2017-12-23 19:38:03 +00:00
Sam McCall 6c0d0f5775 [clangd] Index symbols share storage within a slab.
Summary:
Symbols are not self-contained - it's only safe to hand them out if you
guarantee the lifetime of the underlying data.

Before this lands, I'm going to measure the before/after memory usage of the
LLVM index loaded into memory in a single slab.

Reviewers: hokein

Subscribers: klimek, ilya-biryukov, cfe-commits

Differential Revision: https://reviews.llvm.org/D41483

llvm-svn: 321272
2017-12-21 14:58:44 +00:00
Eric Liu 125cda7d3b [clangd] Igore cases in index fuzzy find.
llvm-svn: 321157
2017-12-20 09:29:54 +00:00
Eric Liu 6f648df1b9 [clangd] Index-based code completion.
Summary: Use symbol index to populate completion results for qualfified IDs e.g. "nx::A^".

Reviewers: ilya-biryukov, sammccall

Reviewed By: ilya-biryukov, sammccall

Subscribers: rwols, klimek, mgorny, cfe-commits, sammccall

Differential Revision: https://reviews.llvm.org/D41281

llvm-svn: 321083
2017-12-19 16:50:37 +00:00
Eric Liu 4feda80a43 [clangd] Support filtering by fixing scopes in fuzzyFind.
Summary: When scopes are specified, only match symbols from scopes.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: klimek, ilya-biryukov, cfe-commits

Differential Revision: https://reviews.llvm.org/D41367

llvm-svn: 321067
2017-12-19 11:37:40 +00:00
Eric Liu a2d4607a4b [clangd] Fix a potential use-after-move bug.
llvm-svn: 320695
2017-12-14 12:31:04 +00:00
Eric Liu 3732cadc73 [clangd] Symbol index interfaces and an in-memory index implementation.
Summary:
o Index interfaces to support using different index sources (e.g. AST index, global index) for code completion, cross-reference finding etc. This patch focuses on code completion.

The following changes in the original patch has been split out.
o Implement an AST-based index.
o Add an option to replace sema code completion for qualified-id with index-based completion.
o Implement an initial naive code completion index which matches symbols that have the query string as substring.

Reviewers: malaperle, sammccall

Reviewed By: sammccall

Subscribers: hokein, klimek, malaperle, mgorny, ilya-biryukov, cfe-commits

Differential Revision: https://reviews.llvm.org/D40548

llvm-svn: 320688
2017-12-14 11:25:49 +00:00