This patch is a proof-of-concept Dex index implementation. It has
several flaws, which don't allow replacing static MemIndex yet, such as:
* Not being able to handle queries of small size (less than 3 symbols);
a way to solve this is generating trigrams of smaller size and having
such incomplete trigrams in the index structure.
* Speed measurements: while manually editing files in Vim and requesting
autocompletion gives an impression that the performance is at least
comparable with the current static index, having actual numbers is
important because we don't want to hurt the users and roll out slow
code. Eric (@ioeric) suggested that we should only replace MemIndex as
soon as we have the evidence that this is not a regression in terms of
performance. An approach which is likely to be successful here is to
wait until we have benchmark library in the LLVM core repository, which
is something I have suggested in the LLVM mailing lists, received
positive feedback on and started working on. I will add a dependency as
soon as the suggested patch is out for a review (currently there's at
least one complication which is being addressed by
https://github.com/google/benchmark/pull/649). Key performance
improvements for iterators are sorting by cost and the limit iterator.
* Quality measurements: currently, boosting iterator and two-phase
lookup stage are not implemented, without these the quality is likely to
be worse than the current implementation can yield. Measuring quality is
tricky, but another suggestion in the offline discussion was that the
drop-in replacement should only happen after Boosting iterators
implementation (and subsequent query enhancement).
The proposed changes do not affect Clangd functionality or performance,
`DexIndex` is only used in unit tests and not in production code.
Reviewed by: ioeric
Differential Revision: https://reviews.llvm.org/D50337
llvm-svn: 340175
Proposed changes:
* Cleanup comments in `clangd/index/dex/Iterator.h`: Vim's `gq`
formatting added redundant spaces instead of newlines in few
places
* Few comments in `OrIterator` are wrong
* Use `EXPECT_TRUE(Condition)` instead of
`EXPECT_THAT(Condition, true)` (same with `EXPECT_FALSE`)
* Don't expose `dump()` method to the public by misplacing
`private:`
This patch does not affect functionality.
Reviewed by: ioeric
Differential Revision: https://reviews.llvm.org/D50956
llvm-svn: 340157
Summary:
Currently we only add parantheses to the functions if snippets are
enabled, which also inserts snippets for parameters into parantheses. Adding a
new option to put only parantheses. Also it moves the cursor within parantheses
or at the end of them by looking at whether completion item has any parameters
or not. Still requires snippets support on the client side.
Reviewers: ioeric, ilya-biryukov, hokein
Reviewed By: ioeric
Subscribers: MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50835
llvm-svn: 340040
Summary: This is a patch of add a testcase for https://reviews.llvm.org/D50628.
Reviewers: ilya-biryukov
Subscribers: javed.absar, ioeric, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50627
llvm-svn: 340035
Summary:
Sema can only be used for documentation in the current file, other doc
comments should be fetched from the index.
Reviewers: hokein, ioeric, kadircet
Reviewed By: hokein, kadircet
Subscribers: MaskRay, jkorous, mgrang, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50727
llvm-svn: 340005
Summary:
Fix an inconsistent behavior of using `LastBuiltPreamble`/`NewPreamble`
in TUScheduler (see the test for details), AST should always use
NewPreamble. This patch makes LastBuiltPreamble always point to
NewPreamble.
Preamble rarely fails to build, even there are errors in headers, so we
assume it would not cause performace issue for code completion.
Reviewers: ilya-biryukov
Reviewed By: ilya-biryukov
Subscribers: javed.absar, ioeric, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50695
llvm-svn: 340001
This patch improves `dex::Iterator` string representation by
incorporating the information about the element which is currently being
pointed to by the `DocumentIterator`.
Reviewed by: ioeric
Differential Revision: https://reviews.llvm.org/D50689
llvm-svn: 339877
Summary:
To avoid producing very verbose output in substitutions involving
typedefs, e.g.
T -> std::vector<std::string>::iterator
gets turned into an unreadable mess when printed out for libstdc++,
result contains internal types (std::__Vector_iterator<...>) and
expanded well-defined typedefs (std::basic_string<char>).
Until we improve the presentation code in clang, going with
non-instantiated decls looks like a better UX trade-off.
Reviewers: hokein, ioeric, kadircet
Reviewed By: hokein
Subscribers: MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50645
llvm-svn: 339665
This patch handles trigram generation "short" identifiers and queries.
Trigram generator produces incomplete trigrams for short names so that
the same query iterator API can be used to match symbols which don't
have enough symbols to form a trigram and correctly handle queries which
also are not sufficient for generating a full trigram.
Reviewed by: ioeric
Differential revision: https://reviews.llvm.org/D50517
llvm-svn: 339548
Summary:
When compile_commands.json contains some source files expressed as
relative paths, we can get duplicate responses to findDefinitions. The
responses only differ by the URI, which are different versions of the
same file:
"result": [
{
...
"uri": "file:///home/emaisin/src/ls-interact/cpp-test/build/../src/first.h"
},
{
...
"uri": "file:///home/emaisin/src/ls-interact/cpp-test/src/first.h"
}
]
In getAbsoluteFilePath, we try to obtain the realpath of the FileEntry
by calling tryGetRealPathName. However, this can fail and return an
empty string. It may be bug a bug in clang, but in any case we should
fall back to computing it ourselves if it happens.
I changed getAbsoluteFilePath so that if tryGetRealPathName succeeds, we
return right away (a real path is always absolute). Otherwise, we try
to build an absolute path, as we did before, but we also call
VFS->getRealPath to make sure to get the canonical path (e.g. without
any ".." in it).
Reviewers: malaperle
Subscribers: hokein, ilya-biryukov, ioeric, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D48687
llvm-svn: 339483
This patch modifies `consume` function to allow retrieval of limited
number of symbols. This is the "cheap" implementation of top-level
limiting iterator. In the future we would like to have a complete limit
iterator implementation to insert it into the query subtrees, but in the
meantime this version would be enough for a fully-functional
proof-of-concept Dex implementation.
Reviewers: ioeric, ilya-biryukov
Reviewed by: ioeric
Differential Revision: https://reviews.llvm.org/D50500
llvm-svn: 339426
Summary:
This allows implementations like different symbol indexes to know what
the current active file is. For example, some customized index implementation
might decide to only return results for some files.
Reviewers: ilya-biryukov
Reviewed By: ilya-biryukov
Subscribers: javed.absar, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50446
llvm-svn: 339320
Summary:
This is the first step of implementing Xrefs in clangd:
- add index interfaces, and related data structures.
Reviewers: sammccall
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D49658
llvm-svn: 339011
The diagnostic messages that are sent to the client from Clangd are now always
capitalized.
Differential Revision: https://reviews.llvm.org/D50154
llvm-svn: 338919
Summary:
After r338256, clangd stopped reporting diagnostics if WantDiags::No request
is followed by a WantDiags::Yes request but the AST can be reused.
Reviewers: ioeric
Reviewed By: ioeric
Subscribers: javed.absar, MaskRay, jkorous, arphaman, jfb, cfe-commits
Differential Revision: https://reviews.llvm.org/D50045
llvm-svn: 338361
The original Dex Iterators patch (https://reviews.llvm.org/rL338017)
caused problems for Clang 3.6 and Clang 3.7 due to the compiler bug
which prevented inferring template parameter (`Size`) in create(And|Or)?
functions. It was reverted in https://reviews.llvm.org/rL338054.
In this revision the mentioned helper functions were replaced with
variadic templated versions.
Proposed changes were tested on multiple compiler versions, including
Clang 3.6 which originally caused the failure.
llvm-svn: 338116
This patch introduces three essential types of query iterators:
`DocumentIterator`, `AndIterator`, `OrIterator`. It provides a
convenient API for query tree generation and serves as a building block
for the next generation symbol index - Dex. Currently, many
optimizations are missed to improve code readability and to serve as the
reference implementation. Potential improvements are briefly mentioned
in `FIXME`s and will be addressed in the following patches.
Dex RFC in the mailing list:
http://lists.llvm.org/pipermail/clangd-dev/2018-July/000022.html
Iterators, their applications and potential extensions are explained in
detail in the design proposal:
https://docs.google.com/document/d/1C-A6PGT6TynyaX4PXyExNMiGmJ2jL1UwV91Kyx11gOI/edit#heading=h.903u1zon9nkj
Reviewers: ioeric, sammccall, ilya-biryukov
Subscribers: cfe-commits, klimek, jfb, mgrang, mgorny, MaskRay, jkorous,
arphaman
Differential Revision: https://reviews.llvm.org/D49546
llvm-svn: 338017
Summary:
If the contents are the same, the update most likely comes from the
fact that compile commands were invalidated. In that case we want to
avoid rebuilds in case the compile commands are actually the same.
Reviewers: ioeric
Reviewed By: ioeric
Subscribers: simark, javed.absar, MaskRay, jkorous, arphaman, jfb, cfe-commits
Differential Revision: https://reviews.llvm.org/D49783
llvm-svn: 338012
Summary:
This has a shape to similar logarithm function but grows much slower for
large #usages.
Metrics: https://reviews.llvm.org/P8096
Reviewers: ilya-biryukov
Reviewed By: ilya-biryukov
Subscribers: MaskRay, jkorous, arphaman, cfe-commits, sammccall
Differential Revision: https://reviews.llvm.org/D49780
llvm-svn: 337907
This patch introduces the core building block of the next-generation
Clangd symbol index - Dex. Search tokens are the keys in the inverted
index and represent a characteristic of a specific symbol: examples of
search token types (Token Namespaces) are
* Trigrams - these are essential for unqualified symbol name fuzzy
search * Scopes for filtering the symbols by the namespace * Paths, e.g.
these can be used to uprank symbols defined close to the edited file
This patch outlines the generic for such token namespaces, but only
implements trigram generation.
The intuition behind trigram generation algorithm is that each extracted
trigram is a valid sequence for Fuzzy Matcher jumps, proposed
implementation utilize existing FuzzyMatcher API for segmentation and
trigram extraction.
However, trigrams generation algorithm for the query string is different
from the previous one: it simply yields sequences of 3 consecutive
lowercased valid characters (letters, digits).
Dex RFC in the mailing list:
http://lists.llvm.org/pipermail/clangd-dev/2018-July/000022.html
The trigram generation techniques are described in detail in the
proposal:
https://docs.google.com/document/d/1C-A6PGT6TynyaX4PXyExNMiGmJ2jL1UwV91Kyx11gOI/edit#heading=h.903u1zon9nkj
Reviewers: sammccall, ioeric, ilya-biryukovA
Subscribers: cfe-commits, klimek, mgorny, MaskRay, jkorous, arphaman
Differential Revision: https://reviews.llvm.org/D49591
llvm-svn: 337901
Summary:
The following are metrics for explicit member access completions. There is no
noticeable impact on other completion types.
Before:
EXPLICIT_MEMBER_ACCESS
Total measurements: 24382
All measurements: MRR: 62.27 Top10: 80.21% Top-100: 94.48%
Full identifiers: MRR: 98.81 Top10: 99.89% Top-100: 99.95%
0-5 filter len:
MRR: 13.25 46.31 62.47 67.77 70.40 81.91
Top-10: 29% 74% 84% 91% 91% 97%
Top-100: 67% 99% 99% 99% 99% 100%
After:
EXPLICIT_MEMBER_ACCESS
Total measurements: 24382
All measurements: MRR: 63.18 Top10: 80.58% Top-100: 95.07%
Full identifiers: MRR: 98.79 Top10: 99.89% Top-100: 99.95%
0-5 filter len:
MRR: 13.84 48.39 63.55 68.83 71.28 82.64
Top-10: 30% 75% 84% 91% 91% 97%
Top-100: 70% 99% 99% 99% 99% 100%
* Top-N: wanted result is found in the first N completion results.
* MRR: Mean reciprocal rank.
Remark: the change seems to have minor positive impact. Although the improvement
is relatively small, down-ranking non-instance members in instance member access
should reduce noise in the completion results.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D49543
llvm-svn: 337681
Summary: This is intended to be used for indexing, e.g. in D49417
Reviewers: ioeric, omtcyfz
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D49540
llvm-svn: 337527
Having `using qualified::name;` for some symbol is an important signal
for clangd code completion as the user is more likely to use such
symbol. This patch helps to uprank the relevant symbols by saving
UsingShadowDecl in the new field of CodeCompletionResult and checking
whether the corresponding UsingShadowDecl is located in the main file
later in ClangD code completion routine. While the relative importance
of such signal is a subject to change in the future, this patch simply
bumps DeclProximity score to the value of 1.0 which should be enough for
now.
The patch was tested using
`$ ninja check-clang check-clang-tools`
No unexpected failures were noticed after running the relevant testsets.
Reviewers: sammccall, ioeric
Subscribers: MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D49012
llvm-svn: 336810
Summary:
Sema code complete in the recovery mode is generally useless. For many
cases, sema first completes in recovery context and then recovers to more useful
context, in which it's favorable to ignore results from recovery (as results are
often bad e.g. all builtin symbols and top-level symbols). There is also case
where only sema would fail to recover e.g. completions in excluded #if block.
Sema would try to give results, but the results are often useless (see the updated
excluded #if block test).
Reviewers: sammccall, ilya-biryukov
Subscribers: MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D49175
llvm-svn: 336801
Summary: This is not enabled in the global-symbol-builder or dynamic index yet.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D49028
llvm-svn: 336553
Summary:
The library has graduated from clangd to llvm/Support.
This is a mechanical change to move to the new API and remove the old one.
Main API changes:
- namespace clang::clangd::json --> llvm::json
- json::Expr --> json::Value
- Expr::asString() etc --> Value::getAsString() etc
- unsigned longs need a cast (due to r336541 adding lossless integer support)
Reviewers: ilya-biryukov
Subscribers: mgorny, ioeric, MaskRay, jkorous, omtcyfz, cfe-commits
Differential Revision: https://reviews.llvm.org/D49077
llvm-svn: 336549
Summary:
To avoid doing extra work of processing headers in the preamble
mutilple times in parallel.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: javed.absar, ioeric, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D48940
llvm-svn: 336538
Summary:
An AST-based approach is used to retrieve the document symbols rather than an
in-memory index query. The index is not an ideal fit to achieve this because of
the file-centric query being done here whereas the index is suited for
project-wide queries. Document symbols also includes more symbols and need to
keep the order as seen in the file.
Signed-off-by: Marc-Andre Laperle <marc-andre.laperle@ericsson.com>
Subscribers: tomgr, ilya-biryukov, ioeric, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D47846
llvm-svn: 336386
Summary:
Following D48903 ([VirtualFileSystem] InMemoryFileSystem::status: Return
a Status with the requested name), the paths output by clang-move in the
FileToReplacements map may contain leading "./". For example, where we
would get "foo.h", we'll now get "./foo.h". This breaks the tests,
because we are doing exact string lookups in the FileToFileID and
Results maps (they contain "foo.h", but we search for "./foo.h").
To mitigate this, try to normalize a little bit the paths output by
clang-move to remove that leading "./".
This patch should be safe to merge before D48903, remove_dots will just
be a no-op.
Reviewers: ilya-biryukov, hokein
Reviewed By: hokein
Subscribers: ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D48951
llvm-svn: 336358
Summary: Surface it in the completion items C++ API, and when a flag is set.
Reviewers: ioeric
Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D48938
llvm-svn: 336309