This patch introduces `PathURI` Search Token kind and utilizes it to
uprank symbols which are defined in files with small distance to the
directory where the fuzzy find request is coming from (e.g. files user
is editing).
Reviewed By: ioeric
Reviewers: ioeric, sammccall
Differential Revision: https://reviews.llvm.org/D51481
llvm-svn: 341542
Summary:
GoToDefinition returns all declaration results (implicit/explicit) that are
in the same location, and the results are returned in arbitrary order.
Some LSP clients defaultly take the first result as the final result, which
might present a bad result (implicit decl) to users.
This patch ranks the result based on whether the declarations are
referenced explicitly/implicitly. We put explicit declarations first.
This also improves the "hover" (which just take the first result) feature
in some cases.
Reviewers: ilya-biryukov
Subscribers: kadircet, ioeric, MaskRay, jkorous, mgrang, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50438
llvm-svn: 341463
Summary:
This is intended to replace the current YAML format for general use.
It's ~10x more compact than YAML, and ~40% more compact than gzipped YAML:
llvmidx.riff = 20M, llvmidx.yaml = 272M, llvmidx.yaml.gz = 32M
It's also simpler/faster to read and write.
The format is a RIFF container (chunks of (type, size, data)) with:
- a compressed string table
- simple binary encoding of symbols (with varints for compactness)
It can be extended to include occurrences, Dex posting lists, etc.
There's no rich backwards-compatibility scheme, but a version number is included
so we can detect incompatible files and do ad-hoc back-compat.
Alternatives considered:
- compressed YAML or JSON: bulky and slow to load
- llvm bitstream: confusing model and libraries are hard to use. My attempt
produced slightly larger files, and the code was longer and slower.
- protobuf or similar: would be really nice (esp for back-compat) but the
dependency is a big hassle
- ad-hoc binary format without a container: it seems clear we're going
to add posting lists and occurrences here, and that they will benefit
from sharing a string table. The container makes it easy to debug
these pieces in isolation, and make them optional.
Reviewers: ioeric
Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D51585
llvm-svn: 341375
Summary:
A few things that I noticed while merging the SwapIndex patch:
- SymbolOccurrences and particularly SymbolOccurrenceSlab are unwieldy names,
and these names appear *a lot*. Ref, RefSlab, etc seem clear enough
and read/format much better.
- The asymmetry between SymbolSlab and RefSlab (build() vs freeze()) is
confusing and irritating, and doesn't even save much code.
Avoiding RefSlab::Builder was my idea, but it was a bad one; add it.
- DenseMap<SymbolID, ArrayRef<Ref>> seems like a reasonable compromise for
constructing MemIndex - and means many less wasted allocations than the
current DenseMap<SymbolID, vector<Ref*>> for FileIndex, and none for
slabs.
- RefSlab::find() is not actually used for anything, so we can throw
away the DenseMap and keep the representation much more compact.
- A few naming/consistency fixes: e.g. Slabs,Refs -> Symbols,Refs.
Reviewers: ioeric
Subscribers: ilya-biryukov, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D51605
llvm-svn: 341368
Summary:
- DynamicIndex doesn't implement ParsingCallbacks, to make its role clearer.
ParsingCallbacks is a separate object owned by the receiving TUScheduler.
(I tried to get rid of the "index-like-object that doesn't implement index"
but it was too messy).
- Clarified(?) docs around DynamicIndex - fewer details up front, more details
inside.
- Exposed dynamic index from ClangdServer for memory monitoring and more
direct testing of its contents (actual tests not added here, wanted to get
this out for review)
- Removed a redundant and sligthly confusing filename param in a callback
Reviewers: ilya-biryukov
Subscribers: javed.absar, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D51221
llvm-svn: 341325
Summary:
This is now handled by a wrapper class SwapIndex, so MemIndex/DexIndex can be
immutable and focus on their job.
Old and busted:
I have a MemIndex, which holds a shared_ptr<vector<Symbol*>>, which keeps the
symbol slab alive. I update by calling build(shared_ptr<vector<Symbol*>>).
New hotness: I have a SwapIndex, which holds a unique_ptr<SymbolIndex>, which
holds a MemIndex, which holds a shared_ptr<void>, which keeps backing
data alive.
I update by building a new MemIndex and calling SwapIndex::reset().
Reviewers: kbobyrev, ioeric
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D51422
llvm-svn: 341318
Summary:
Currently, a symbol can have only one #include header attached, which
might not work well if the symbol can be imported via different #includes depending
on where it's used. This patch stores multiple #include headers (with # references)
for each symbol, so that CodeCompletion can decide which include to insert.
In this patch, code completion simply picks the most popular include as the default inserted header. We also return all possible includes and their edits in the `CodeCompletion` results.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: mgrang, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D51291
llvm-svn: 341304
SymbolCollector will be used for two cases:
- collect Symbol type only, used for indexing preamble AST.
- collect Symbol and SymbolOccurrences, used for indexing main AST.
For finding local references from the AST, we will implement it in other ways.
llvm-svn: 341208
Summary:
After code completion inserts a header, running signature help using the old
preamble will usually fail. So we add support for consistent preamble reads.
Reviewers: ilya-biryukov
Subscribers: javed.absar, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D51438
llvm-svn: 341076
Summary: Only accessible via the C++ API at the moment.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D51437
llvm-svn: 341065
This patch introduces iterator cost concept to improve the performance
of Dex query iterators (mainly, AND iterator). Benchmarks show that the
queries become ~10% faster.
Before
```
-------------------------------------------------------
Benchmark Time CPU Iteration
-------------------------------------------------------
DexAdHocQueries 5883074 ns 5883018 ns 117
DexRealQ 959904457 ns 959898507 ns 1
```
After
```
-------------------------------------------------------
Benchmark Time CPU Iteration
-------------------------------------------------------
DexAdHocQueries 5238403 ns 5238361 ns 130
DexRealQ 873275207 ns 873269453 ns 1
```
Reviewed by: sammccall
Differential Revision: https://reviews.llvm.org/D51310
llvm-svn: 341057
This patch introduces LIMIT iterator, which is very important for
improving the quality of search query. LIMIT iterators can be applied on
top of BOOST iterators to prevent populating query request with a huge
number of low-quality symbols.
Reviewed by: sammccall
Differential Revision: https://reviews.llvm.org/D51029
llvm-svn: 340605
Summary:
For index-based code completion, send an asynchronous speculative index
request, based on the index request for the last code completion on the same
file and the filter text typed before the cursor, before sema code completion
is invoked. This can reduce the code completion latency (by roughly latency of
sema code completion) if the speculative request is the same as the one
generated for the ongoing code completion from sema. As a sequence of code
completions often have the same scopes and proximity paths etc, this should be
effective for a number of code completions.
Trace with speculative index request:{F6997544}
Reviewers: hokein, ilya-biryukov
Reviewed By: ilya-biryukov
Subscribers: javed.absar, jfb, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D50962
llvm-svn: 340604
This patch prints information about built index size estimation to
verbose logs. This is useful for optimizing memory usage of DexIndex and
comparisons with MemIndex.
Reviewed by: sammccall
Differential Revision: https://reviews.llvm.org/D51154
llvm-svn: 340601
Summary:
Currently we match an include only if we are inside filename, with this patch we
will match whenever we are on the starting line of the include.
Reviewers: ilya-biryukov, hokein, ioeric
Reviewed By: ilya-biryukov
Subscribers: MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D51163
llvm-svn: 340539
Summary:
Whenever a code-completion is triggered within a class/struct/union looks at
base classes and figures out non-overriden virtual functions. Than suggests
completions for those.
Reviewers: ilya-biryukov, hokein, ioeric
Reviewed By: hokein
Subscribers: MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50898
llvm-svn: 340530
Summary:
We were handling the EnableFunctionArgSnippets only when we are producing LSP
response. Move that code into CompletionItem generation so that internal clients
can benefit from that as well.
Reviewers: ilya-biryukov, ioeric, hokein
Reviewed By: ilya-biryukov
Subscribers: MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D51102
llvm-svn: 340527
Some of them timeout on our buildbots in certain configurations.
The timeouts are there to avoid hanging indefinitely on deadlocks, so
the exact number we put there does not matter.
llvm-svn: 340523
This patch introduces BOOST iterator - a substantial block for efficient
and high-quality symbol retrieval. The concept of boosting allows
performing computationally inexpensive scoring on the query side so that
the final (expensive) scoring can only be applied on the items with the
highest preliminary score while eliminating the need to score too many
items.
Reviewed by: ilya-biryukov
Differential Revision: https://reviews.llvm.org/D50970
llvm-svn: 340409
Summary:
Will be used for updating the dynamic index on updates to the open files.
Currently we collect only information coming from the preamble
AST. This has a bunch of limitations:
- Dynamic index misses important information from the body of the
file, e.g. locations of definitions.
- XRefs cannot be collected at all, since we can only obtain full
information for the current file (preamble is parsed with skipped
function bodies, therefore not reliable).
This patch only adds the new callback, actually updates to the index
will be done in a follow-up patch.
Reviewers: hokein
Reviewed By: hokein
Subscribers: kadircet, javed.absar, ioeric, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50847
llvm-svn: 340401
This patch is a proof-of-concept Dex index implementation. It has
several flaws, which don't allow replacing static MemIndex yet, such as:
* Not being able to handle queries of small size (less than 3 symbols);
a way to solve this is generating trigrams of smaller size and having
such incomplete trigrams in the index structure.
* Speed measurements: while manually editing files in Vim and requesting
autocompletion gives an impression that the performance is at least
comparable with the current static index, having actual numbers is
important because we don't want to hurt the users and roll out slow
code. Eric (@ioeric) suggested that we should only replace MemIndex as
soon as we have the evidence that this is not a regression in terms of
performance. An approach which is likely to be successful here is to
wait until we have benchmark library in the LLVM core repository, which
is something I have suggested in the LLVM mailing lists, received
positive feedback on and started working on. I will add a dependency as
soon as the suggested patch is out for a review (currently there's at
least one complication which is being addressed by
https://github.com/google/benchmark/pull/649). Key performance
improvements for iterators are sorting by cost and the limit iterator.
* Quality measurements: currently, boosting iterator and two-phase
lookup stage are not implemented, without these the quality is likely to
be worse than the current implementation can yield. Measuring quality is
tricky, but another suggestion in the offline discussion was that the
drop-in replacement should only happen after Boosting iterators
implementation (and subsequent query enhancement).
The proposed changes do not affect Clangd functionality or performance,
`DexIndex` is only used in unit tests and not in production code.
Reviewed by: ioeric
Differential Revision: https://reviews.llvm.org/D50337
llvm-svn: 340175
Proposed changes:
* Cleanup comments in `clangd/index/dex/Iterator.h`: Vim's `gq`
formatting added redundant spaces instead of newlines in few
places
* Few comments in `OrIterator` are wrong
* Use `EXPECT_TRUE(Condition)` instead of
`EXPECT_THAT(Condition, true)` (same with `EXPECT_FALSE`)
* Don't expose `dump()` method to the public by misplacing
`private:`
This patch does not affect functionality.
Reviewed by: ioeric
Differential Revision: https://reviews.llvm.org/D50956
llvm-svn: 340157
Summary:
Currently we only add parantheses to the functions if snippets are
enabled, which also inserts snippets for parameters into parantheses. Adding a
new option to put only parantheses. Also it moves the cursor within parantheses
or at the end of them by looking at whether completion item has any parameters
or not. Still requires snippets support on the client side.
Reviewers: ioeric, ilya-biryukov, hokein
Reviewed By: ioeric
Subscribers: MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50835
llvm-svn: 340040
Summary: This is a patch of add a testcase for https://reviews.llvm.org/D50628.
Reviewers: ilya-biryukov
Subscribers: javed.absar, ioeric, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50627
llvm-svn: 340035
Summary:
Sema can only be used for documentation in the current file, other doc
comments should be fetched from the index.
Reviewers: hokein, ioeric, kadircet
Reviewed By: hokein, kadircet
Subscribers: MaskRay, jkorous, mgrang, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50727
llvm-svn: 340005
Summary:
Fix an inconsistent behavior of using `LastBuiltPreamble`/`NewPreamble`
in TUScheduler (see the test for details), AST should always use
NewPreamble. This patch makes LastBuiltPreamble always point to
NewPreamble.
Preamble rarely fails to build, even there are errors in headers, so we
assume it would not cause performace issue for code completion.
Reviewers: ilya-biryukov
Reviewed By: ilya-biryukov
Subscribers: javed.absar, ioeric, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50695
llvm-svn: 340001
This patch improves `dex::Iterator` string representation by
incorporating the information about the element which is currently being
pointed to by the `DocumentIterator`.
Reviewed by: ioeric
Differential Revision: https://reviews.llvm.org/D50689
llvm-svn: 339877
Summary:
To avoid producing very verbose output in substitutions involving
typedefs, e.g.
T -> std::vector<std::string>::iterator
gets turned into an unreadable mess when printed out for libstdc++,
result contains internal types (std::__Vector_iterator<...>) and
expanded well-defined typedefs (std::basic_string<char>).
Until we improve the presentation code in clang, going with
non-instantiated decls looks like a better UX trade-off.
Reviewers: hokein, ioeric, kadircet
Reviewed By: hokein
Subscribers: MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50645
llvm-svn: 339665
This patch handles trigram generation "short" identifiers and queries.
Trigram generator produces incomplete trigrams for short names so that
the same query iterator API can be used to match symbols which don't
have enough symbols to form a trigram and correctly handle queries which
also are not sufficient for generating a full trigram.
Reviewed by: ioeric
Differential revision: https://reviews.llvm.org/D50517
llvm-svn: 339548
Summary:
When compile_commands.json contains some source files expressed as
relative paths, we can get duplicate responses to findDefinitions. The
responses only differ by the URI, which are different versions of the
same file:
"result": [
{
...
"uri": "file:///home/emaisin/src/ls-interact/cpp-test/build/../src/first.h"
},
{
...
"uri": "file:///home/emaisin/src/ls-interact/cpp-test/src/first.h"
}
]
In getAbsoluteFilePath, we try to obtain the realpath of the FileEntry
by calling tryGetRealPathName. However, this can fail and return an
empty string. It may be bug a bug in clang, but in any case we should
fall back to computing it ourselves if it happens.
I changed getAbsoluteFilePath so that if tryGetRealPathName succeeds, we
return right away (a real path is always absolute). Otherwise, we try
to build an absolute path, as we did before, but we also call
VFS->getRealPath to make sure to get the canonical path (e.g. without
any ".." in it).
Reviewers: malaperle
Subscribers: hokein, ilya-biryukov, ioeric, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D48687
llvm-svn: 339483
This patch modifies `consume` function to allow retrieval of limited
number of symbols. This is the "cheap" implementation of top-level
limiting iterator. In the future we would like to have a complete limit
iterator implementation to insert it into the query subtrees, but in the
meantime this version would be enough for a fully-functional
proof-of-concept Dex implementation.
Reviewers: ioeric, ilya-biryukov
Reviewed by: ioeric
Differential Revision: https://reviews.llvm.org/D50500
llvm-svn: 339426