Commit Graph

11 Commits

Author SHA1 Message Date
Utkarsh Saxena bf935a034b [clangd] Make categorical features 64 bit in DecisionForest Model.
CodeCompletionContext::Kind has 36 Kinds. The completion model used to
support categorical features of 32 cardinality.
Due to this clangd tests were failing asan tests due to overflow.

This patch makes the completion model support 64 cardinality of
categorical features by storing ENUM Features as uint64_t instead of
uint32_t.

Verified that this fixes the asan failures.

Latency: 6.7ms (old) VS 6.8ms (new) per 1000 predictions.

Differential Revision: https://reviews.llvm.org/D97770
2021-03-02 16:22:30 +01:00
Sam McCall 7d1b499cae Revert "[clangd] Extract symbol-scope logic out of Quality, add tests. NFC"
On second thought, this can't properly be reused for highlighting.

Consider this example, which Quality wants to consider function-scope,
but highlighting must consider class-scope:

void foo() {
  class X {
    int ^y;
  };
}
2021-01-29 14:59:16 +01:00
Sam McCall d0817b5f18 [clangd] Extract symbol-scope logic out of Quality, add tests. NFC
This prepares for reuse from the semantic highlighting code.

There's a bit of yak-shaving here:
 - when the enum is moved into the clangd namespace, promote it to a
   scoped enum. This means teaching the decision forest infrastructure
   to deal with scoped enums.
 - AccessibleScope isn't quite the right name: e.g. public class members
   are treated as accessible, but still have class scope. So rename to
   SymbolScope.
 - Rename some QualitySignals members to avoid name conflicts.
   (the string) SymbolScope -> Scope
   (the enum) Scope -> ScopeKind
2021-01-29 14:44:28 +01:00
Aaron Ballman 45e0f65162 Add a floating-point suffix to silence warnings; NFC
This silences about 6000 warnings about truncating from double to float
with Visual Studio.
2020-11-04 10:09:51 -05:00
Utkarsh Saxena 45698ac005 [clangd] Split DecisionForest Evaluate() into one func per tree.
This allows us MSAN to instrument this function. Previous version is not
instrumentable due to it shear volume.

Differential Revision: https://reviews.llvm.org/D88536
2020-10-01 18:07:23 +02:00
Utkarsh Saxena a9f63d22fa [clangd] Disable msan instrumentation for generated Evaluate().
MSAN build times out for generated DecisionForest inference runtime.

A solution worth trying is splitting the function into 300 smaller
functions and then re-enable msan.

For now we are disabling instrumentation for the generated function.

Differential Revision: https://reviews.llvm.org/D88495
2020-09-29 17:44:10 +02:00
Utkarsh Saxena 985deba931 Revert "Temporarily Revert "[clangd] Add Random Forest runtime for code completion.""
We intend to replace heuristics based code completion ranking with a Decision Forest Model.

This patch introduces a format for representing the model and an inference runtime that is code-generated at build time.
- Forest.json contains all the trees as an array of trees.
- Features.json describes the features to be used.
- Codegen file takes the above two files and generates CompletionModel containing Feature struct and corresponding Evaluate function.
   The Evaluate function maps a feature to a real number describing the relevance of this candidate.
- The codegen is part of build system and these files are generated at build time.
- Proposes a way to test the generated runtime using a test model.
  - Replicates the model structure in unittests.
  - unittest tests both the test model (for correct tree traversal) and the real model (for sanity).

This reverts commit 549e55b3d5.
2020-09-19 10:54:04 +02:00
Eric Christopher 549e55b3d5 Temporarily Revert "[clangd] Add Random Forest runtime for code completion."
as a header doesn't appear to have made it into the commit.

This reverts commit 9b6765e784 and followup
2020-09-18 14:47:43 -07:00
Nico Weber 807777913e CompletionModelCodegen: Remove unused import
The unused import is 3.4+, so it also breaks py2.7 compat.
But this is easy to fix :)
2020-09-18 16:24:58 -04:00
Nico Weber 0ea2a57274 clangd: Make ompletionModelCodegen.py tpy2.7 compatible
LLVM still supports Python 2.7, so unbreak bots that still run that.
In a separate commit so that this is easy to revert once we drop
support :)
2020-09-18 15:26:58 -04:00
Utkarsh Saxena 9b6765e784 [clangd] Add Random Forest runtime for code completion.
Summary:
[WIP]
- Proposes a json format for representing Random Forest model.
- Proposes a way to test the generated runtime using a test model.

TODO:
- Add generated source code snippet for easier review.
- Fix unused label warning.
- Figure out required using declarations for CATEGORICAL columns from Features.json.
- Necessary Google3 internal modifications for blaze before landing.
- Add documentation for format of the model.
- Document more.

Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D83814
2020-09-18 19:25:56 +02:00