CodeCompletionContext::Kind has 36 Kinds. The completion model currently
only handles categorical features of 32 cardinality.
Changing the datatype to uint64_t will solve the problem.
This reverts commit 438b5bb05a.
This makes code completion use a Decision Forest based ranking algorithm to rank
completion candidates. [Esitmated 6% accuracy boost]. This was
previously hidden behind the flag --ranking-model=decision_forest. This
patch makes it the default ranking algorithm.
Note: this is a generic model, not specialized for any particular
project. clangd does not collect or upload data to train code completion.
Also treat Keywords separately as they are not recorded by the training set generator.
Differential Revision: https://reviews.llvm.org/D96353
This patch only introduces new signals but does not use their value
in scoring a CC candidate. Usage of these signals in CC ranking in both
heiristics and ML model will be introduced in later patches.
Differential Revision: https://reviews.llvm.org/D94473
With every incremental change, one needs to check-in new model upstream.
This also significantly increases the size of the git repo with every
new model.
Testing and comparing the old and previous model is also not possible as
we run only a single model at any point.
One solution is to have a "staging" decision forest which can be
injected into clangd without pushing it to upstream. Compare the
performance of the staging model with the live model. After a couple of
enhancements have been done to staging model, we can then replace the
live model upstream with the staging model. This reduces upstream churn
and also allows us to compare models with current baseline model.
This is done by having a callback in CodeCompleteOptions which is called
only when we want to use a decision forest ranking model. This allows us
to inject different completion model internally.
Differential Revision: https://reviews.llvm.org/D90014
By default clangd will score a code completion item using heuristics model.
Scoring can be done by Decision Forest model by passing `--ranking_model=decision_forest` to
clangd.
Features omitted from the model:
- `NameMatch` is excluded because the final score must be multiplicative in `NameMatch` to allow rescoring by the editor.
- `NeedsFixIts` is excluded because the generating dataset that needs 'fixits' is non-trivial.
There are multiple ways (heuristics) to combine the above two features with the prediction of the DF:
- `NeedsFixIts` is used as is with a penalty of `0.5`.
Various alternatives of combining NameMatch `N` and Decision forest Prediction `P`
- N * scale(P, 0, 1): Linearly scale the output of model to range [0, 1]
- N * a^P:
- More natural: Prediction of each Decision Tree can be considered as a multiplicative boost (like NameMatch)
- Ordering is independent of the absolute value of P. Order of two items is proportional to `a^{difference in model prediction score}`. Higher `a` gives higher weightage to model output as compared to NameMatch score.
Baseline MRR = 0.619
MRR for various combinations:
N * P = 0.6346, advantage%=2.5768
N * 1.1^P = 0.6600, advantage%=6.6853
N * **1.2**^P = 0.6669, advantage%=**7.8005**
N * **1.3**^P = 0.6668, advantage%=**7.7795**
N * **1.4**^P = 0.6659, advantage%=**7.6270**
N * 1.5^P = 0.6646, advantage%=7.4200
N * 1.6^P = 0.6636, advantage%=7.2671
N * 1.7^P = 0.6629, advantage%=7.1450
N * 2^P = 0.6612, advantage%=6.8673
N * 2.5^P = 0.6598, advantage%=6.6491
N * 3^P = 0.6590, advantage%=6.5242
N * scaled[0, 1] = 0.6465, advantage%=4.5054
Differential Revision: https://reviews.llvm.org/D88281
Summary:
Structure is parsed from the raw comment using the existing heuristics used
for hover.
Reviewers: kadircet
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79157
Summary:
It is used by code completion and signature help. Code completion
already uses a special no-compile mode for missing preambles, so this change is
a no-op for that.
As for signature help, it already blocks for a preamble and missing it implies
clang has failed to parse the preamble and retrying it in signature help likely
will fail again. And even if it doesn't, request latency will be too high to be
useful as parsing preambles is expensive.
Reviewers: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77204
Summary:
This patch adds an instrumentation mode for clangd (enabled by
corresponding option in cc_opts).
If this mode is enabled then user can specify callbacks to run on the
final code completion result.
Moreover the CodeCompletion::Score will contain the detailed Quality and
Relevance signals used to compute the score when this mode is enabled.
These are required because we do not any place in which the final
candidates (scored and sorted) are available along with the above
signals. The signals are temporary structures in `addCandidate`.
The callback is needed as it gives access to many data structures that
are internal to CodeCompleteFlow and are available once Sema has run. Eg:
ScopeDistnace and FileDistance.
If this mode is disabled (as in default) then Score would just contain 2
shared pointers (null). Thus cost(memory/time) increase for the default
mode would be fairly cheap and insignificant.
Reviewers: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75603
Summary:
Informative only, useful for positioning UI, interacting with other sources of
completion etc. As requested by an embedder of clangd.
Reviewers: usaxena95
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74305
Summary:
I didn't manage to find something nicer than optional<bool>, but at least I
found a sneakier comment.
Reviewers: kadircet
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64216
llvm-svn: 365356
Summary: Embedding clients want to experiment with showing such results in e.g. a different color.
Reviewers: kadircet
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61588
llvm-svn: 360039
Summary:
o Lex the code to get the identifiers and put them into a "symbol" index.
o Adds a new completion mode without compilation/sema into code completion workflow.
o Make IncludeInserter work even when no compile command is present, by avoiding
inserting non-verbatim headers.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60126
llvm-svn: 358159
Summary: One clear use case: use with an editor that reacts poorly to edits above the cursor.
Reviewers: ioeric
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60409
llvm-svn: 358075
Summary:
Intent is to use the heuristically-parsed scope in cases where we get bogus
results from sema, such as in complex macro expansions.
Added a motivating testcase we currently get wrong.
Name changed because we (already) use this for things other than speculation.
Reviewers: ioeric
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60500
llvm-svn: 358074
Summary:
When calling TUScehduler::runWithPreamble (e.g. in code compleiton), allow
entering a fallback mode when compile command or preamble is not ready, instead of
waiting. This allows clangd to perform naive code completion e.g. using identifiers
in the current file or symbols in the index.
This patch simply returns empty result for code completion in fallback mode. Identifier-based
plus more advanced index-based completion will be added in followup patches.
Reviewers: ilya-biryukov, sammccall
Reviewed By: sammccall
Subscribers: sammccall, javed.absar, MaskRay, jkorous, arphaman, kadircet, jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D59811
llvm-svn: 357916
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
This patch moves the virtual file system form clang to llvm so it can be
used by more projects.
Concretely the patch:
- Moves VirtualFileSystem.{h|cpp} from clang/Basic to llvm/Support.
- Moves the corresponding unit test from clang to llvm.
- Moves the vfs namespace from clang::vfs to llvm::vfs.
- Formats the lines affected by this change, mostly this is the result of
the added llvm namespace.
RFC on the mailing list:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/126657.html
Differential revision: https://reviews.llvm.org/D52783
llvm-svn: 344140
Summary:
The file stats can be reused when preamble is reused (e.g. code
completion). It's safe to assume that cached status is not outdated as we
assume preamble files to remain unchanged.
On real file system, this made code completion ~20% faster on a measured file
(with big preamble). The preamble build time doesn't change much.
Reviewers: sammccall, ilya-biryukov
Reviewed By: sammccall
Subscribers: mgorny, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D52419
llvm-svn: 343576
Summary:
When no scope qualifier is specified, allow completing index symbols
from any scope and insert proper automatically. This is still experimental and
hidden behind a flag.
Things missing:
- Scope proximity based scoring.
- FuzzyFind supports weighted scopes.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: kbobyrev, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D52364
llvm-svn: 343248
Summary:
- DynamicIndex doesn't implement ParsingCallbacks, to make its role clearer.
ParsingCallbacks is a separate object owned by the receiving TUScheduler.
(I tried to get rid of the "index-like-object that doesn't implement index"
but it was too messy).
- Clarified(?) docs around DynamicIndex - fewer details up front, more details
inside.
- Exposed dynamic index from ClangdServer for memory monitoring and more
direct testing of its contents (actual tests not added here, wanted to get
this out for review)
- Removed a redundant and sligthly confusing filename param in a callback
Reviewers: ilya-biryukov
Subscribers: javed.absar, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D51221
llvm-svn: 341325
Summary:
Currently, a symbol can have only one #include header attached, which
might not work well if the symbol can be imported via different #includes depending
on where it's used. This patch stores multiple #include headers (with # references)
for each symbol, so that CodeCompletion can decide which include to insert.
In this patch, code completion simply picks the most popular include as the default inserted header. We also return all possible includes and their edits in the `CodeCompletion` results.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: mgrang, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D51291
llvm-svn: 341304
Summary:
For index-based code completion, send an asynchronous speculative index
request, based on the index request for the last code completion on the same
file and the filter text typed before the cursor, before sema code completion
is invoked. This can reduce the code completion latency (by roughly latency of
sema code completion) if the speculative request is the same as the one
generated for the ongoing code completion from sema. As a sequence of code
completions often have the same scopes and proximity paths etc, this should be
effective for a number of code completions.
Trace with speculative index request:{F6997544}
Reviewers: hokein, ilya-biryukov
Reviewed By: ilya-biryukov
Subscribers: javed.absar, jfb, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D50962
llvm-svn: 340604
Summary:
Currently we only add parantheses to the functions if snippets are
enabled, which also inserts snippets for parameters into parantheses. Adding a
new option to put only parantheses. Also it moves the cursor within parantheses
or at the end of them by looking at whether completion item has any parameters
or not. Still requires snippets support on the client side.
Reviewers: ioeric, ilya-biryukov, hokein
Reviewed By: ioeric
Subscribers: MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50835
llvm-svn: 340040
Summary:
Sema can only be used for documentation in the current file, other doc
comments should be fetched from the index.
Reviewers: hokein, ioeric, kadircet
Reviewed By: hokein, kadircet
Subscribers: MaskRay, jkorous, mgrang, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D50727
llvm-svn: 340005
Summary:
The following are metrics for explicit member access completions. There is no
noticeable impact on other completion types.
Before:
EXPLICIT_MEMBER_ACCESS
Total measurements: 24382
All measurements: MRR: 62.27 Top10: 80.21% Top-100: 94.48%
Full identifiers: MRR: 98.81 Top10: 99.89% Top-100: 99.95%
0-5 filter len:
MRR: 13.25 46.31 62.47 67.77 70.40 81.91
Top-10: 29% 74% 84% 91% 91% 97%
Top-100: 67% 99% 99% 99% 99% 100%
After:
EXPLICIT_MEMBER_ACCESS
Total measurements: 24382
All measurements: MRR: 63.18 Top10: 80.58% Top-100: 95.07%
Full identifiers: MRR: 98.79 Top10: 99.89% Top-100: 99.95%
0-5 filter len:
MRR: 13.84 48.39 63.55 68.83 71.28 82.64
Top-10: 30% 75% 84% 91% 91% 97%
Top-100: 70% 99% 99% 99% 99% 100%
* Top-N: wanted result is found in the first N completion results.
* MRR: Mean reciprocal rank.
Remark: the change seems to have minor positive impact. Although the improvement
is relatively small, down-ranking non-instance members in instance member access
should reduce noise in the completion results.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D49543
llvm-svn: 337681
Summary: Surface it in the completion items C++ API, and when a flag is set.
Reviewers: ioeric
Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D48938
llvm-svn: 336309
Summary:
We now compute a distance from the main file to the symbol header, which
is a weighted count of:
- some number of #include traversals from source file --> included file
- some number of FS traversals from file --> parent directory
- some number of FS traversals from parent directory --> child file/dir
This calculation is performed in the appropriate URI scheme.
This means we'll get some proximity boost from header files in main-file
contexts, even when these are in different directory trees.
This extended file proximity model is not yet incorporated in the index
interface/implementation.
Reviewers: ioeric
Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D48441
llvm-svn: 336177
Summary:
This provides more structured information that embedders can use for rendering.
ClangdLSPServer continues to call render(), so NFC.
The patch is:
- trivial changes to ClangdServer/ClangdLSPServer
- mostly-mechanical updates to CodeCompleteTests etc for the new API
- new direct tests of render() in CodeCompleteTests
- tiny cleanups to CodeCompletionItem (operator<< and missing initializers)
Reviewers: ioeric
Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D48821
llvm-svn: 336094
Summary:
LSP has some presentational fields with limited semantics (e.g. 'detail') and
doesn't provide a good place to return information like namespace.
Some places where more detailed information is useful:
- tools like quality analysis
- alternate frontends that aren't strictly LSP
- code completion unit tests
In this patch, ClangdServer::codeComplete still return LSP CompletionList, but
I plan to switch that soon (should be a no-op for ClangdLSPServer).
Deferring this makes it clear that we don't change behavior (tests stay the
same) and also keeps the API break to a small patch which can be reverted.
Reviewers: ioeric
Subscribers: ilya-biryukov, MaskRay, cfe-commits, jkorous
Differential Revision: https://reviews.llvm.org/D48762
llvm-svn: 335980
Summary:
For completion items that would trigger include insertions (i.e. index symbols
that are not #included yet), add a visual indicator "+" before the completion
label. The inserted headers will appear in the completion detail.
Open to suggestions for better visual indicators; "+" was picked because it
seems cleaner than a few other candidates I've tried (*, #, @ ...).
The displayed header would be like a/b/c.h (without quote) or <vector> for system
headers. I didn't add quotation or "#include" because they can take up limited
space and do not provide additional information after users know what the
headers are. I think a header alone should be obvious for users to infer that
this is an include header..
To align indentation, also prepend ' ' to labels of candidates that would not
trigger include insertions (only for completions where index results are
possible).
Vim:
{F6357587}
vscode:
{F6357589}
{F6357591}
Reviewers: sammccall, ilya-biryukov, hokein
Reviewed By: sammccall
Subscribers: MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D48163
llvm-svn: 334828
Summary:
Adds a CodeCompleteOption to folds together compatible function/method overloads
into a single item. This feels pretty good (for editors with signatureHelp
support), but has limitations.
This happens in the code completion merge step, so there may be inconsistencies
(e.g. if only one overload made it into the index result list, no folding).
We don't want to bundle together completions that have different side-effects
(include insertion), because we can't constructo a coherent CompletionItem.
This may be confusing for users, as the reason for non-bundling may not
be immediately obvious. (Also, the implementation seems a little fragile)
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D47957
llvm-svn: 334822
Summary:
This adds more symbols to the index:
- member variables and functions
- enum constants in scoped enums
The code completion behavior should remain intact but workspace symbols should
now provide much more useful symbols.
Other symbols should be considered such as the ones in "main files" (files not
being included) but this can be done separately as this introduces its fair
share of problems.
Signed-off-by: Marc-Andre Laperle <marc-andre.laperle@ericsson.com>
Reviewers: ioeric, sammccall
Reviewed By: ioeric, sammccall
Subscribers: hokein, sammccall, jkorous, klimek, ilya-biryukov, jkorous-apple, ioeric, MaskRay, cfe-commits
Differential Revision: https://reviews.llvm.org/D44954
llvm-svn: 334017