llvm-project

Commit Graph

Author	SHA1	Message	Date
Utkarsh Saxena	275716d6db	[clangd] Derive new signals in CC from ASTSignals. This patch only introduces new signals but does not use their value in scoring a CC candidate. Usage of these signals in CC ranking in both heuristics and ML model will be introduced in later patches. Differential Revision: https://reviews.llvm.org/D94473	2021-01-18 17:37:27 +01:00
Utkarsh Saxena	7df80a1204	[clangd] Add support for multiple DecisionForest model experiments. With every incremental change, one needs to check-in new model upstream. This also significantly increases the size of the git repo with every new model. Testing and comparing the old and previous model is also not possible as we run only a single model at any point. One solution is to have a "staging" decision forest which can be injected into clangd without pushing it to upstream. Compare the performance of the staging model with the live model. After a couple of enhancements have been done to staging model, we can then replace the live model upstream with the staging model. This reduces upstream churn and also allows us to compare models with current baseline model. This is done by having a callback in CodeCompleteOptions which is called only when we want to use a decision forest ranking model. This allows us to inject different completion model internally. Differential Revision: https://reviews.llvm.org/D90014	2020-10-29 19:49:40 +01:00
Utkarsh Saxena	9b1666f3ce	[clangd] Rename evaluate() to evaluateHeuristics() Since we have 2 scoring functions (heuristics and decision forest), renaming the existing evaluate() function to be more descriptive of the Heuristics being evaluated in it. Differential Revision: https://reviews.llvm.org/D88431	2020-09-28 20:05:01 +02:00
Utkarsh Saxena	a8b55b6939	[clangd] Use Decision Forest to score code completions. By default clangd will score a code completion item using heuristics model. Scoring can be done by Decision Forest model by passing `--ranking_model=decision_forest` to clangd. Features omitted from the model: - `NameMatch` is excluded because the final score must be multiplicative in `NameMatch` to allow rescoring by the editor. - `NeedsFixIts` is excluded because the generating dataset that needs 'fixits' is non-trivial. There are multiple ways (heuristics) to combine the above two features with the prediction of the DF: - `NeedsFixIts` is used as is with a penalty of `0.5`. Various alternatives of combining NameMatch `N` and Decision forest Prediction `P` - N * scale(P, 0, 1): Linearly scale the output of model to range [0, 1] - N * a^P: - More natural: Prediction of each Decision Tree can be considered as a multiplicative boost (like NameMatch) - Ordering is independent of the absolute value of P. Order of two items is proportional to `a^{difference in model prediction score}`. Higher `a` gives higher weightage to model output as compared to NameMatch score. Baseline MRR = 0.619 MRR for various combinations: N * P = 0.6346, advantage%=2.5768 N * 1.1^P = 0.6600, advantage%=6.6853 N * 1.2^P = 0.6669, advantage%=7.8005 N * 1.3^P = 0.6668, advantage%=7.7795 N * 1.4^P = 0.6659, advantage%=7.6270 N * 1.5^P = 0.6646, advantage%=7.4200 N * 1.6^P = 0.6636, advantage%=7.2671 N * 1.7^P = 0.6629, advantage%=7.1450 N * 2^P = 0.6612, advantage%=6.8673 N * 2.5^P = 0.6598, advantage%=6.6491 N * 3^P = 0.6590, advantage%=6.5242 N * scaled[0, 1] = 0.6465, advantage%=4.5054 Differential Revision: https://reviews.llvm.org/D88281	2020-09-28 18:59:29 +02:00
Utkarsh Saxena	158af0d3d1	[clangd] Refactor code completion signal's utility properties. Current implementation of heuristic-based scoring function also contains computation of derived signals (e.g. whether name contains a word from context, computing file distances, scope distances.) This is an attempt to separate out the logic for computation of derived signals from the scoring function. This will allow us to have a clean API for scoring functions that will take only concrete code completion signals as input. Differential Revision: https://reviews.llvm.org/D88146	2020-09-23 16:12:18 +02:00
Ilya Biryukov	54eeb3f40a	[clangd] Remove unused signature help quality signal. NFC ContainsActiveParameter is not used anywhere, set incorrectly (see the removed FIXME) and has no unit tests. Removing it to simplify the code. llvm-svn: 362686	2019-06-06 08:32:25 +00:00
Sam McCall	9fb22b2c86	[clangd] Boost code completion results that were named in the last few lines. Summary: The hope is this will catch a few patterns with repetition: SomeClass* S = ^SomeClass::Create() int getFrobnicator() { return ^frobnicator_; } // discard the factory, it's no longer valid. ^MyFactory.reset(); Without triggering antipatterns too often: return Point(x.first, x.^second); I'm going to gather some data on whether this turns out to be a win overall. Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, jfb, kadircet, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D61537 llvm-svn: 360030	2019-05-06 10:25:10 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Eric Liu	5ac37f495a	[clangd] Penalize destructor and overloaded operators in code completion. Reviewers: hokein Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D55061 llvm-svn: 347983	2018-11-30 11:17:15 +00:00
Ilya Biryukov	647da3e8a5	[clangd] Add type boosting in code completion Reviewers: sammccall, ioeric Reviewed By: sammccall Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D52276 llvm-svn: 347562	2018-11-26 15:38:01 +00:00
Eric Liu	52a11b5662	[clangd] Downrank members from base class Reviewers: sammccall, ilya-biryukov Reviewed By: sammccall Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D53638 llvm-svn: 345140	2018-10-24 13:45:17 +00:00
Eric Liu	4859738cfe	[clangd] Names that are not spelled in source code are reserved. Summary: These are often not expected to be used directly e.g. ``` TEST_F(Fixture, X) { ^ // "Fixture_X_Test" expanded in the macro should be down ranked. } ``` Only doing this for sema for now, as such symbols are mostly coming from sema e.g. gtest macros expanded in the main file. We could also add a similar field for the index symbol. Reviewers: sammccall Reviewed By: sammccall Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D53374 llvm-svn: 344736	2018-10-18 12:23:05 +00:00
Eric Liu	3fac4ef1fd	[clangd] Support scope proximity in code completion. Summary: This should make all-scope completion more usable. Scope proximity for indexes will be added in followup patch. Reviewers: sammccall Reviewed By: sammccall Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D53131 llvm-svn: 344688	2018-10-17 11:19:02 +00:00
Kirill Bobyrev	8e35f1e7cb	NFC: Enforce good formatting across multiple clang-tools-extra files This patch improves readability of multiple files in clang-tools-extra and enforces LLVM Coding Guidelines. Reviewed by: ioeric Differential Revision: https://reviews.llvm.org/D50707 llvm-svn: 339687	2018-08-14 16:03:32 +00:00
Kadir Cetinkaya	e486e37a09	[clangd] Introduce scoring mechanism for SignatureInformations. Reviewers: ilya-biryukov Reviewed By: ilya-biryukov Subscribers: mgrang, ioeric, MaskRay, jkorous, arphaman, cfe-commits Differential Revision: https://reviews.llvm.org/D50555 llvm-svn: 339547	2018-08-13 08:40:05 +00:00
Kadir Cetinkaya	2f84d91131	Added functionality to suggest FixIts for conversion of '->' to '.' and vice versa. Summary: Added functionality to suggest FixIts for conversion of '->' to '.' and vice versa. Reviewers: ilya-biryukov Reviewed By: ilya-biryukov Subscribers: yvvan, ioeric, jkorous, arphaman, cfe-commits, kadircet Differential Revision: https://reviews.llvm.org/D50193 llvm-svn: 339224	2018-08-08 08:59:29 +00:00
Eric Liu	d7de81172e	[clangd] Tune down quality score for class constructors so that it's ranked after class types. Summary: Currently, class constructors have the same score as the class types, and they are often ranked before class types. This is often not desirable and can be annoying when snippet is enabled and constructor signatures are added. Metrics: ``` ================================================================================================== OVERALL ================================================================================================== Total measurements: 111117 (+0) All measurements: MRR: 64.06 (+0.20) Top-5: 75.73% (+0.14%) Top-100: 93.71% (+0.01%) Full identifiers: MRR: 98.25 (+0.55) Top-5: 99.04% (+0.03%) Top-100: 99.16% (+0.00%) Filter length 0-5: MRR: 15.23 (+0.02) 50.50 (-0.02) 65.04 (+0.11) 70.75 (+0.19) 74.37 (+0.25) 79.43 (+0.32) Top-5: 40.90% (+0.03%) 74.52% (+0.03%) 87.23% (+0.15%) 91.68% (+0.08%) 93.68% (+0.14%) 95.87% (+0.12%) Top-100: 68.21% (+0.02%) 96.28% (+0.07%) 98.43% (+0.00%) 98.72% (+0.00%) 98.74% (+0.01%) 98.81% (+0.00%) ================================================================================================== DEFAULT ================================================================================================== Total measurements: 57535 (+0) All measurements: MRR: 58.07 (+0.37) Top-5: 69.94% (+0.26%) Top-100: 90.14% (+0.03%) Full identifiers: MRR: 97.13 (+1.05) Top-5: 98.14% (+0.06%) Top-100: 98.34% (+0.00%) Filter length 0-5: MRR: 13.91 (+0.00) 38.53 (+0.01) 55.58 (+0.21) 63.63 (+0.30) 69.23 (+0.47) 72.87 (+0.60) Top-5: 24.99% (+0.00%) 62.70% (+0.06%) 82.80% (+0.30%) 88.66% (+0.16%) 92.02% (+0.27%) 93.53% (+0.21%) Top-100: 51.56% (+0.05%) 93.19% (+0.13%) 97.30% (+0.00%) 97.81% (+0.00%) 97.85% (+0.01%) 97.79% (+0.00%) ``` Remark: - The full-id completions have +1.05 MRR improvement. - There is no noticeable impact on EXPLICIT_MEMBER_ACCESS and WANT_LOCAL. Reviewers: sammccall Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits Differential Revision: https://reviews.llvm.org/D49667 llvm-svn: 337816	2018-07-24 08:51:52 +00:00
Eric Liu	5d2a807f25	[clangd] Penalize non-instance members when accessed via class instances. Summary: The following are metrics for explicit member access completions. There is no noticeable impact on other completion types. Before: EXPLICIT_MEMBER_ACCESS Total measurements: 24382 All measurements: MRR: 62.27 Top10: 80.21% Top-100: 94.48% Full identifiers: MRR: 98.81 Top10: 99.89% Top-100: 99.95% 0-5 filter len: MRR: 13.25 46.31 62.47 67.77 70.40 81.91 Top-10: 29% 74% 84% 91% 91% 97% Top-100: 67% 99% 99% 99% 99% 100% After: EXPLICIT_MEMBER_ACCESS Total measurements: 24382 All measurements: MRR: 63.18 Top10: 80.58% Top-100: 95.07% Full identifiers: MRR: 98.79 Top10: 99.89% Top-100: 99.95% 0-5 filter len: MRR: 13.84 48.39 63.55 68.83 71.28 82.64 Top-10: 30% 75% 84% 91% 91% 97% Top-100: 70% 99% 99% 99% 99% 100% * Top-N: wanted result is found in the first N completion results. * MRR: Mean reciprocal rank. Remark: the change seems to have minor positive impact. Although the improvement is relatively small, down-ranking non-instance members in instance member access should reduce noise in the completion results. Reviewers: sammccall Reviewed By: sammccall Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits Differential Revision: https://reviews.llvm.org/D49543 llvm-svn: 337681	2018-07-23 10:56:37 +00:00
Sam McCall	3f0243fdaf	[clangd] Incorporate transitive #includes into code complete proximity scoring. Summary: We now compute a distance from the main file to the symbol header, which is a weighted count of: - some number of #include traversals from source file --> included file - some number of FS traversals from file --> parent directory - some number of FS traversals from parent directory --> child file/dir This calculation is performed in the appropriate URI scheme. This means we'll get some proximity boost from header files in main-file contexts, even when these are in different directory trees. This extended file proximity model is not yet incorporated in the index interface/implementation. Reviewers: ioeric Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D48441 llvm-svn: 336177	2018-07-03 08:09:29 +00:00
Eric Liu	09c3c37b72	[clangd] Boost completion score according to file proximity. Summary: Also move unittest: URI scheme to TestFS so that it can be shared by different tests. Reviewers: sammccall Reviewed By: sammccall Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D47935 llvm-svn: 334810	2018-06-15 08:58:12 +00:00
Sam McCall	c3b5bad723	[clangd] Boost keyword completions. Summary: These have few signals other than being keywords, so the boost is high. Reviewers: ilya-biryukov Subscribers: ioeric, MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D48083 llvm-svn: 334711	2018-06-14 13:42:21 +00:00
Sam McCall	e018b36cea	[clangd] Downrank symbols with reserved names (score *= 0.1) Reviewers: ilya-biryukov Subscribers: klimek, ioeric, MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D47707 llvm-svn: 334274	2018-06-08 09:36:34 +00:00
Sam McCall	4caa85129f	[clangd] Code completion: drop explicit injected names/operators, ignore Sema priority Summary: Now we have most of Sema's code completion signals incorporated in Quality, which will allow us to give consistent ranking to sema/index results. Therefore we can/should stop using Sema priority as an explicit signal. This fixes some issues like namespaces always having a terrible score. The most important missing signals are: - Really dumb/rarely useful completions like: SomeStruct().^SomeStruct SomeStruct().^operator= SomeStruct().~SomeStruct() We already filter out destructors, this patch adds injected names and operators to that list. - type matching the expression context. Ilya has a plan to add this in a way that's compatible with indexes (design doc should be shared real soon now!) Reviewers: ioeric Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D47871 llvm-svn: 334192	2018-06-07 12:49:17 +00:00
Sam McCall	bc7cbb7895	[clangd] Boost fuzzy match score by 2x (so a maximum of 2) when the query is the full identifier name. Summary: Fix a couple of bugs in tests and in Quality to keep tests passing. Reviewers: ioeric Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D47815 llvm-svn: 334089	2018-06-06 12:38:37 +00:00
Sam McCall	4a3c69ba6e	Adjust symbol score based on crude symbol type. Summary: Numbers are guesses to be adjusted later. Reviewers: ioeric Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D47787 llvm-svn: 334074	2018-06-06 08:53:36 +00:00
Sam McCall	d9b54f0025	[clangd] Boost code completion results that are narrowly scoped (local, members) Summary: This signal is considered a relevance rather than a quality signal because it's dependent on the query (the fact that it's completion, and implicitly the query context). This is part of the effort to reduce reliance on Sema priority, so we can have consistent ranking between Index and Sema results. Reviewers: ioeric Subscribers: klimek, ilya-biryukov, MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D47762 llvm-svn: 334026	2018-06-05 16:30:25 +00:00
Sam McCall	db41e1c6da	[clangd] Test tweaks (consistency, shorter, doc). NFC llvm-svn: 334014	2018-06-05 12:22:43 +00:00
Ilya Biryukov	f029646fa3	[clangd] Boost scores for decls from current file in completion Summary: This should, arguably, give better ranking. Reviewers: ioeric, sammccall Reviewed By: sammccall Subscribers: mgorny, klimek, MaskRay, jkorous, mgrang, cfe-commits Differential Revision: https://reviews.llvm.org/D46943 llvm-svn: 333906	2018-06-04 14:50:59 +00:00
Ilya Biryukov	8573defa2d	[clangd] clang-format the source code. NFC llvm-svn: 333537	2018-05-30 12:41:19 +00:00
Sam McCall	c5707b6c36	[clangd] Extract scoring/ranking logic, and shave yaks. Summary: Code completion scoring was embedded in CodeComplete.cpp, which is bad: - awkward to test. The mechanisms (extracting info from index/sema) can be unit-tested well, the policy (scoring) should be quantitatively measured. Neither was easily possible, and debugging was hard. The intermediate signal struct makes this easier. - hard to reuse. This is a bug in workspaceSymbols: it just presents the results in the index order, which is not sorted in practice, it needs to rank them! Also, index implementations care about scoring (both query-dependent and independent) in order to truncate result lists appropriately. The main yak shaved here is the build() function that had 3 variants across unit tests is unified in TestTU.h (rather than adding a 4th variant). Reviewers: ilya-biryukov Subscribers: klimek, mgorny, ioeric, MaskRay, jkorous, mgrang, cfe-commits Differential Revision: https://reviews.llvm.org/D46524 llvm-svn: 332378	2018-05-15 17:43:27 +00:00

30 Commits
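
Note on commit a8b55b6939: its message describes combining the NameMatch score `N` with the Decision Forest prediction `P` as `N * a^P`, with `NeedsFixIts` applied as a penalty of `0.5`. The sketch below illustrates that formula only. It is not clangd's code: the function name `combine_score` and its parameters are hypothetical, and the base `1.2` in the usage lines is simply the entry with the highest MRR in the commit's table, not necessarily the value clangd adopted.

```python
def combine_score(name_match: float, prediction: float, base: float,
                  needs_fixits: bool = False) -> float:
    """Illustrative N * a^P combination (hypothetical helper, not clangd's API).

    name_match   -- NameMatch score N, kept multiplicative so an editor can rescore
    prediction   -- Decision Forest prediction P
    base         -- the constant a; a larger base gives the model output more weight
    needs_fixits -- when True, apply the 0.5 penalty mentioned in the commit message
    """
    score = name_match * base ** prediction
    if needs_fixits:
        score *= 0.5
    return score


# Two candidates with the same NameMatch score: the ratio of their final
# scores depends only on the difference of their predictions, a^(P1 - P2).
first = combine_score(0.8, 3.0, base=1.2)
second = combine_score(0.8, 1.0, base=1.2)
print(first / second)  # approximately 1.2 ** 2
```

Because `N` stays a plain multiplicative factor, the final score can be rescaled by a new NameMatch value without re-running the model, which is the reason the commit message gives for leaving `NameMatch` out of the model's features.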