llvm-project

Author	SHA1	Message	Date
Utkarsh Saxena	890190a61d	Revert "Revert "[clangd] Use ML Code completion ranking as default."" The ASAN failure was fixed by `bf935a034b`. This reverts commit `7f086d74c3`.	2021-03-02 18:03:52 +01:00
Utkarsh Saxena	7f086d74c3	Revert "[clangd] Use ML Code completion ranking as default." CodeCompletionContext::Kind has 36 Kinds. The completion model currently only handles categorical features of 32 cardinality. Changing the datatype to uint64_t will solve the problem. This reverts commit `438b5bb05a`.	2021-03-02 15:04:23 +01:00
Utkarsh Saxena	438b5bb05a	[clangd] Use ML Code completion ranking as default. This makes code completion use a Decision Forest based ranking algorithm to rank completion candidates. [Estimated 6% accuracy boost]. This was previously hidden behind the flag --ranking-model=decision_forest. This patch makes it the default ranking algorithm. Note: this is a generic model, not specialized for any particular project. clangd does not collect or upload data to train code completion. Also treat Keywords separately as they are not recorded by the training set generator. Differential Revision: https://reviews.llvm.org/D96353	2021-03-02 10:05:37 +01:00
Sam McCall	7d1b499cae	Revert "[clangd] Extract symbol-scope logic out of Quality, add tests. NFC" On second thought, this can't properly be reused for highlighting. Consider this example, which Quality wants to consider function-scope, but highlighting must consider class-scope: void foo() { class X { int ^y; }; }	2021-01-29 14:59:16 +01:00
Sam McCall	d0817b5f18	[clangd] Extract symbol-scope logic out of Quality, add tests. NFC This prepares for reuse from the semantic highlighting code. There's a bit of yak-shaving here: - when the enum is moved into the clangd namespace, promote it to a scoped enum. This means teaching the decision forest infrastructure to deal with scoped enums. - AccessibleScope isn't quite the right name: e.g. public class members are treated as accessible, but still have class scope. So rename to SymbolScope. - Rename some QualitySignals members to avoid name conflicts. (the string) SymbolScope -> Scope (the enum) Scope -> ScopeKind	2021-01-29 14:44:28 +01:00
Utkarsh Saxena	17846ed5af	[clangd] Use ASTSignals in Heuristics CC Ranking. Differential Revision: https://reviews.llvm.org/D94927	2021-01-19 19:48:42 +01:00
Utkarsh Saxena	275716d6db	[clangd] Derive new signals in CC from ASTSignals. This patch only introduces new signals but does not use their value in scoring a CC candidate. Usage of these signals in CC ranking in both heuristics and ML model will be introduced in later patches. Differential Revision: https://reviews.llvm.org/D94473	2021-01-18 17:37:27 +01:00
Utkarsh Saxena	9abbc05097	[clangd] Use !empty() instead of size()>0	2021-01-17 15:26:40 +01:00
Utkarsh Saxena	0f9908a7c9	[clangd] Use empty() instead of size()>0	2021-01-17 15:13:01 +01:00
Utkarsh Saxena	d5047d762f	[clangd] Update CC Ranking model with better sampling. A better sampling strategy was used to generate the dataset for this model. New signals introduced in this model: - NumNameInContext: Number of words in the context that match the name of the candidate. - FractionNameInContext: Fraction of the words in context matching the name of the candidate. We remove the signal `IsForbidden` from the model and down rank forbidden signals aggressively. Differential Revision: https://reviews.llvm.org/D94697	2021-01-15 18:13:24 +01:00
Utkarsh Saxena	f253823398	[clangd] Trivial: Log missing completion signals. Differential Revision: https://reviews.llvm.org/D90828	2020-11-05 18:52:44 +01:00
Utkarsh Saxena	7df80a1204	[clangd] Add support for multiple DecisionForest model experiments. With every incremental change, one needs to check-in new model upstream. This also significantly increases the size of the git repo with every new model. Testing and comparing the old and previous model is also not possible as we run only a single model at any point. One solution is to have a "staging" decision forest which can be injected into clangd without pushing it to upstream. Compare the performance of the staging model with the live model. After a couple of enhancements have been done to staging model, we can then replace the live model upstream with the staging model. This reduces upstream churn and also allows us to compare models with current baseline model. This is done by having a callback in CodeCompleteOptions which is called only when we want to use a decision forest ranking model. This allows us to inject different completion model internally. Differential Revision: https://reviews.llvm.org/D90014	2020-10-29 19:49:40 +01:00
Utkarsh Saxena	9b1666f3ce	[clangd] Rename evaluate() to evaluateHeuristics() Since we have 2 scoring functions (heuristics and decision forest), renaming the existing evaluate() function to be more descriptive of the Heuristics being evaluated in it. Differential Revision: https://reviews.llvm.org/D88431	2020-09-28 20:05:01 +02:00
Utkarsh Saxena	a8b55b6939	[clangd] Use Decision Forest to score code completions. By default clangd will score a code completion item using heuristics model. Scoring can be done by Decision Forest model by passing `--ranking_model=decision_forest` to clangd. Features omitted from the model: - `NameMatch` is excluded because the final score must be multiplicative in `NameMatch` to allow rescoring by the editor. - `NeedsFixIts` is excluded because the generating dataset that needs 'fixits' is non-trivial. There are multiple ways (heuristics) to combine the above two features with the prediction of the DF: - `NeedsFixIts` is used as is with a penalty of `0.5`. Various alternatives of combining NameMatch `N` and Decision forest Prediction `P` - N * scale(P, 0, 1): Linearly scale the output of model to range [0, 1] - N * a^P: - More natural: Prediction of each Decision Tree can be considered as a multiplicative boost (like NameMatch) - Ordering is independent of the absolute value of P. Order of two items is proportional to `a^{difference in model prediction score}`. Higher `a` gives higher weightage to model output as compared to NameMatch score. Baseline MRR = 0.619 MRR for various combinations: N * P = 0.6346, advantage%=2.5768 N * 1.1^P = 0.6600, advantage%=6.6853 N * 1.2^P = 0.6669, advantage%=7.8005 N * 1.3^P = 0.6668, advantage%=7.7795 N * 1.4^P = 0.6659, advantage%=7.6270 N * 1.5^P = 0.6646, advantage%=7.4200 N * 1.6^P = 0.6636, advantage%=7.2671 N * 1.7^P = 0.6629, advantage%=7.1450 N * 2^P = 0.6612, advantage%=6.8673 N * 2.5^P = 0.6598, advantage%=6.6491 N * 3^P = 0.6590, advantage%=6.5242 N * scaled[0, 1] = 0.6465, advantage%=4.5054 Differential Revision: https://reviews.llvm.org/D88281	2020-09-28 18:59:29 +02:00
Utkarsh Saxena	158af0d3d1	[clangd] Refactor code completion signal's utility properties. Current implementation of heuristic-based scoring function also contains computation of derived signals (e.g. whether name contains a word from context, computing file distances, scope distances.) This is an attempt to separate out the logic for computation of derived signals from the scoring function. This will allow us to have a clean API for scoring functions that will take only concrete code completion signals as input. Differential Revision: https://reviews.llvm.org/D88146	2020-09-23 16:12:18 +02:00
Kazuaki Ishizaki	dd5571d51a	[clang-tools-extra] NFC: Fix trivial typo in documents and comments Differential Revision: https://reviews.llvm.org/D77458	2020-04-05 15:28:40 +09:00
Kadir Cetinkaya	84240e0db8	[clang][Index] Introduce a TemplateParm SymbolKind Summary: Currently template parameters has symbolkind `Unknown`. This patch introduces a new kind `TemplateParm` for templatetemplate, templatetype and nontypetemplate parameters. Also adds tests in clangd hover feature. Reviewers: sammccall Subscribers: kristof.beyls, ilya-biryukov, jkorous, arphaman, usaxena95, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D73696	2020-02-14 13:20:34 +01:00
Sam McCall	2629035a00	[clangd] Don't assert when completing a lambda variable inside the lambda. Summary: This is a fairly ugly hack - we back off several features for any variable whose type isn't deduced, to avoid computing/caching linkage. Better suggestions welcome. Fixes https://github.com/clangd/clangd/issues/274 Reviewers: kadircet, kbobyrev Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D73960	2020-02-04 17:24:26 +01:00
Haojian Wu	6ae86ea275	[clangd] cleanup: unify the implementation of checking a location is inside main file. Summary: We have variant implementations in the codebase, this patch unifies them. Reviewers: ilya-biryukov, kadircet Subscribers: MaskRay, jkorous, arphaman, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D64915 llvm-svn: 366541	2019-07-19 08:33:39 +00:00
Ilya Biryukov	54eeb3f40a	[clangd] Remove unused signature help quality signal. NFC ContainsActiveParameter is not used anywhere, set incorrectly (see the removed FIXME) and has no unit tests. Removing it to simplify the code. llvm-svn: 362686	2019-06-06 08:32:25 +00:00
Sam McCall	9fb22b2c86	[clangd] Boost code completion results that were named in the last few lines. Summary: The hope is this will catch a few patterns with repetition: SomeClass* S = ^SomeClass::Create() int getFrobnicator() { return ^frobnicator_; } // discard the factory, it's no longer valid. ^MyFactory.reset(); Without triggering antipatterns too often: return Point(x.first, x.^second); I'm going to gather some data on whether this turns out to be a win overall. Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, jfb, kadircet, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D61537 llvm-svn: 360030	2019-05-06 10:25:10 +00:00
Dmitri Gribenko	cb83ea6274	Moved Ref into its own header and implementation file Reviewers: ioeric Subscribers: mgorny, jkorous, mgrang, arphaman, kadircet, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D58778 llvm-svn: 355090	2019-02-28 13:49:25 +00:00
Sam McCall	a4cf26b499	[clangd] Penalize file-scope symbols in the ranking for non-completion queries Patch by Nathan Ridge! Differential Revision: https://reviews.llvm.org/D56653 llvm-svn: 352868	2019-02-01 13:07:37 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Ilya Biryukov	f2001aa743	[clangd] Remove 'using namespace llvm' from .cpp files. NFC The new guideline is to qualify with 'llvm::' explicitly both in '.h' and '.cpp' files. This simplifies moving the code between header and source files and is easier to keep consistent. llvm-svn: 350531	2019-01-07 15:45:19 +00:00
Eric Liu	5ac37f495a	[clangd] Penalize destructor and overloaded operators in code completion. Reviewers: hokein Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D55061 llvm-svn: 347983	2018-11-30 11:17:15 +00:00
Eric Liu	1208240ac9	[clangd] Fix test broken in r347754. llvm-svn: 347755	2018-11-28 14:00:09 +00:00
Eric Liu	e9a33b7ece	[clangd] Less penalty for cross-namespace completions. llvm-svn: 347754	2018-11-28 13:45:25 +00:00
Ilya Biryukov	647da3e8a5	[clangd] Add type boosting in code completion Reviewers: sammccall, ioeric Reviewed By: sammccall Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D52276 llvm-svn: 347562	2018-11-26 15:38:01 +00:00
Simon Pilgrim	5369168983	Fix MSVC "truncation from 'double' to 'float'" warnings. NFCI. llvm-svn: 345184	2018-10-24 19:31:24 +00:00
Eric Liu	52a11b5662	[clangd] Downrank members from base class Reviewers: sammccall, ilya-biryukov Reviewed By: sammccall Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D53638 llvm-svn: 345140	2018-10-24 13:45:17 +00:00
Sam McCall	c008af6466	[clangd] Namespace style cleanup in cpp files. NFC. Standardize on the most common namespace setup in our *.cpp files: using namespace llvm; namespace clang { namespace clangd { void foo(StringRef) { ... } And remove redundant llvm:: qualifiers. (Except for cases like make_unique where this causes problems with std:: and ADL). This choice is pretty arbitrary, but some broad consistency is nice. This is going to conflict with everything. Sorry :-/ Squash the other configurations: A) using namespace llvm; using namespace clang; using namespace clangd; void clangd::foo(StringRef); This is in some of the older files. (It prevents accidentally defining a new function instead of one in the header file, for what that's worth). B) namespace clang { namespace clangd { void foo(llvm::StringRef) { ... } This is fine, but in practice the using directive often gets added over time. C) namespace clang { namespace clangd { using namespace llvm; // inside the namespace This was pretty common, but is a bit misleading: name lookup prefers clang::clangd::foo > clang::foo > llvm:: foo (no matter where the using directive is). llvm-svn: 344850	2018-10-20 15:30:37 +00:00
Simon Pilgrim	15ee23fc40	Fix MSVC "truncation from 'double' to 'float'" warning. NFCI. llvm-svn: 344845	2018-10-20 13:20:26 +00:00
Eric Liu	4859738cfe	[clangd] Names that are not spelled in source code are reserved. Summary: These are often not expected to be used directly e.g. ``` TEST_F(Fixture, X) { ^ // "Fixture_X_Test" expanded in the macro should be down ranked. } ``` Only doing this for sema for now, as such symbols are mostly coming from sema e.g. gtest macros expanded in the main file. We could also add a similar field for the index symbol. Reviewers: sammccall Reviewed By: sammccall Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D53374 llvm-svn: 344736	2018-10-18 12:23:05 +00:00
Eric Liu	3fac4ef1fd	[clangd] Support scope proximity in code completion. Summary: This should make all-scope completion more usable. Scope proximity for indexes will be added in followup patch. Reviewers: sammccall Reviewed By: sammccall Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D53131 llvm-svn: 344688	2018-10-17 11:19:02 +00:00
Eric Liu	6df66001ee	[clangd] Add "Deprecated" field to Symbol and CodeCompletion. Summary: Also set "deprecated" field in LSP CompletionItem. Reviewers: sammccall, kadircet Reviewed By: sammccall Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits Differential Revision: https://reviews.llvm.org/D51724 llvm-svn: 341576	2018-09-06 18:52:26 +00:00
Eric Liu	f592d281a7	[clangd] Tune macro quality scoring for code completion. x0.2 seems to be too much penalty, macros might be wanted in some cases; changing to 0.5x instead. The tuning didn't affect ranking for non-macro completions. llvm-svn: 341449	2018-09-05 07:40:38 +00:00
Kirill Bobyrev	8e35f1e7cb	NFC: Enforce good formatting across multiple clang-tools-extra files This patch improves readability of multiple files in clang-tools-extra and enforces LLVM Coding Guidelines. Reviewed by: ioeric Differential Revision: https://reviews.llvm.org/D50707 llvm-svn: 339687	2018-08-14 16:03:32 +00:00
Kadir Cetinkaya	e486e37a09	[clangd] Introduce scoring mechanism for SignatureInformations. Reviewers: ilya-biryukov Reviewed By: ilya-biryukov Subscribers: mgrang, ioeric, MaskRay, jkorous, arphaman, cfe-commits Differential Revision: https://reviews.llvm.org/D50555 llvm-svn: 339547	2018-08-13 08:40:05 +00:00
Kadir Cetinkaya	2f84d91131	Added functionality to suggest FixIts for conversion of '->' to '.' and vice versa. Summary: Added functionality to suggest FixIts for conversion of '->' to '.' and vice versa. Reviewers: ilya-biryukov Reviewed By: ilya-biryukov Subscribers: yvvan, ioeric, jkorous, arphaman, cfe-commits, kadircet Differential Revision: https://reviews.llvm.org/D50193 llvm-svn: 339224	2018-08-08 08:59:29 +00:00
Ilya Biryukov	74f2655dc7	[clangd] Fix (most) naming warnings from clang-tidy. NFC llvm-svn: 338021	2018-07-26 12:05:31 +00:00
Eric Liu	84bd5db209	[clangd] Use a sigmoid style function for #usages boost in symbol quality. Summary: This has a shape similar to a logarithm function but grows much slower for large #usages. Metrics: https://reviews.llvm.org/P8096 Reviewers: ilya-biryukov Reviewed By: ilya-biryukov Subscribers: MaskRay, jkorous, arphaman, cfe-commits, sammccall Differential Revision: https://reviews.llvm.org/D49780 llvm-svn: 337907	2018-07-25 11:26:35 +00:00
Eric Liu	d7de81172e	[clangd] Tune down quality score for class constructors so that it's ranked after class types. Summary: Currently, class constructors have the same score as the class types, and they are often ranked before class types. This is often not desirable and can be annoying when snippet is enabled and constructor signatures are added. Metrics: ``` ================================================================================================== OVERALL ================================================================================================== Total measurements: 111117 (+0) All measurements: MRR: 64.06 (+0.20) Top-5: 75.73% (+0.14%) Top-100: 93.71% (+0.01%) Full identifiers: MRR: 98.25 (+0.55) Top-5: 99.04% (+0.03%) Top-100: 99.16% (+0.00%) Filter length 0-5: MRR: 15.23 (+0.02) 50.50 (-0.02) 65.04 (+0.11) 70.75 (+0.19) 74.37 (+0.25) 79.43 (+0.32) Top-5: 40.90% (+0.03%) 74.52% (+0.03%) 87.23% (+0.15%) 91.68% (+0.08%) 93.68% (+0.14%) 95.87% (+0.12%) Top-100: 68.21% (+0.02%) 96.28% (+0.07%) 98.43% (+0.00%) 98.72% (+0.00%) 98.74% (+0.01%) 98.81% (+0.00%) ================================================================================================== DEFAULT ================================================================================================== Total measurements: 57535 (+0) All measurements: MRR: 58.07 (+0.37) Top-5: 69.94% (+0.26%) Top-100: 90.14% (+0.03%) Full identifiers: MRR: 97.13 (+1.05) Top-5: 98.14% (+0.06%) Top-100: 98.34% (+0.00%) Filter length 0-5: MRR: 13.91 (+0.00) 38.53 (+0.01) 55.58 (+0.21) 63.63 (+0.30) 69.23 (+0.47) 72.87 (+0.60) Top-5: 24.99% (+0.00%) 62.70% (+0.06%) 82.80% (+0.30%) 88.66% (+0.16%) 92.02% (+0.27%) 93.53% (+0.21%) Top-100: 51.56% (+0.05%) 93.19% (+0.13%) 97.30% (+0.00%) 97.81% (+0.00%) 97.85% (+0.01%) 97.79% (+0.00%) ``` Remark: - The full-id completions have +1.05 MRR improvement. - There is no noticeable impact on EXPLICIT_MEMBER_ACCESS and WANT_LOCAL. Reviewers: sammccall Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits Differential Revision: https://reviews.llvm.org/D49667 llvm-svn: 337816	2018-07-24 08:51:52 +00:00
Eric Liu	5d2a807f25	[clangd] Penalize non-instance members when accessed via class instances. Summary: The following are metrics for explicit member access completions. There is no noticeable impact on other completion types. Before: EXPLICIT_MEMBER_ACCESS Total measurements: 24382 All measurements: MRR: 62.27 Top10: 80.21% Top-100: 94.48% Full identifiers: MRR: 98.81 Top10: 99.89% Top-100: 99.95% 0-5 filter len: MRR: 13.25 46.31 62.47 67.77 70.40 81.91 Top-10: 29% 74% 84% 91% 91% 97% Top-100: 67% 99% 99% 99% 99% 100% After: EXPLICIT_MEMBER_ACCESS Total measurements: 24382 All measurements: MRR: 63.18 Top10: 80.58% Top-100: 95.07% Full identifiers: MRR: 98.79 Top10: 99.89% Top-100: 99.95% 0-5 filter len: MRR: 13.84 48.39 63.55 68.83 71.28 82.64 Top-10: 30% 75% 84% 91% 91% 97% Top-100: 70% 99% 99% 99% 99% 100% * Top-N: wanted result is found in the first N completion results. * MRR: Mean reciprocal rank. Remark: the change seems to have minor positive impact. Although the improvement is relatively small, down-ranking non-instance members in instance member access should reduce noise in the completion results. Reviewers: sammccall Reviewed By: sammccall Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits Differential Revision: https://reviews.llvm.org/D49543 llvm-svn: 337681	2018-07-23 10:56:37 +00:00
Kirill Bobyrev	47d7f52dea	[clangd] Uprank declarations when "using q::name" is present in the main file Having `using qualified::name;` for some symbol is an important signal for clangd code completion as the user is more likely to use such symbol. This patch helps to uprank the relevant symbols by saving UsingShadowDecl in the new field of CodeCompletionResult and checking whether the corresponding UsingShadowDecl is located in the main file later in ClangD code completion routine. While the relative importance of such signal is a subject to change in the future, this patch simply bumps DeclProximity score to the value of 1.0 which should be enough for now. The patch was tested using `$ ninja check-clang check-clang-tools` No unexpected failures were noticed after running the relevant testsets. Reviewers: sammccall, ioeric Subscribers: MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D49012 llvm-svn: 336810	2018-07-11 14:49:49 +00:00
Kirill Bobyrev	7cf29bc028	[NFS] Wipe trailing whitespaces This patch is a preparation for another one containing meaningful changes. This patch simply removes trailing whitespaces in few files affected by the upcoming patch and reformats llvm-svn: 336330	2018-07-05 09:37:26 +00:00
Simon Pilgrim	4a03201324	Fix -Wunused-variable warning. NFCI. llvm-svn: 336329	2018-07-05 09:35:12 +00:00
Eric Liu	8944f0ef33	[clangd] Treat class constructor as in the same scope as the class in ranking. Reviewers: sammccall Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D48933 llvm-svn: 336318	2018-07-05 08:14:04 +00:00
Sam McCall	3f0243fdaf	[clangd] Incorporate transitive #includes into code complete proximity scoring. Summary: We now compute a distance from the main file to the symbol header, which is a weighted count of: - some number of #include traversals from source file --> included file - some number of FS traversals from file --> parent directory - some number of FS traversals from parent directory --> child file/dir This calculation is performed in the appropriate URI scheme. This means we'll get some proximity boost from header files in main-file contexts, even when these are in different directory trees. This extended file proximity model is not yet incorporated in the index interface/implementation. Reviewers: ioeric Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D48441 llvm-svn: 336177	2018-07-03 08:09:29 +00:00
Eric Liu	cdc5f6ad5c	[clangd] Use log10 instead of the natural logarithm for usage boost. llvm-svn: 335874	2018-06-28 16:51:12 +00:00
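
Several of the commit messages above report MRR and Top-N figures, and a8b55b6939 describes combining the NameMatch score `N` with the Decision Forest prediction `P` as `N * a^P`. The following is a minimal Python sketch of those quantities, not clangd's actual C++ code. All function and parameter names are hypothetical. Treating the `NeedsFixIts` penalty of `0.5` as a multiplier is an assumption, since the commit message only says the signal is used "with a penalty of `0.5`".

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Mean of 1/rank over all measurements.

    `ranks` holds the 1-based position of the wanted result in each
    completion list; 0 means the wanted result was not returned at all
    (an illustrative convention, contributing 0 to the mean).
    """
    if not ranks:
        return 0.0
    return sum(1.0 / r for r in ranks if r > 0) / len(ranks)


def top_n_rate(ranks: Sequence[int], n: int) -> float:
    """Fraction of measurements whose wanted result is in the first n results."""
    if not ranks:
        return 0.0
    return sum(1 for r in ranks if 0 < r <= n) / len(ranks)


def combined_score(name_match: float, prediction: float, base: float,
                   needs_fixits: bool = False) -> float:
    """Combine NameMatch N and model prediction P as N * base**P.

    The 0.5 factor for fix-its is an assumed multiplicative penalty.
    """
    score = name_match * base ** prediction
    if needs_fixits:
        score *= 0.5
    return score


if __name__ == "__main__":
    ranks = [1, 2, 0, 4]  # made-up ranks, for illustration only
    print("MRR:", mean_reciprocal_rank(ranks))
    print("Top-2:", top_n_rate(ranks, 2))
    print("score:", combined_score(0.5, 2.0, base=1.2))
```

With this form, two candidates with the same `N` are separated by a factor of `base ** (P1 - P2)`, so their relative order depends only on the difference between the model predictions and not on their absolute values. The score also stays multiplicative in `N`, which the commit message gives as the reason for keeping `NameMatch` out of the model.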

67 Commits