llvm-project/clang-tools-extra/clangd/quality/model
Utkarsh Saxena a0a6fd435c [clangd] New CC Ranking Model to fix bad inference due to overflow.
Unreachable file distances are represented as
`std::numeric_limits<unsigned>::max()`.
The previous dataset recorded the signals as `signed int`, so this default
value overflowed and was captured as `-1`.
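
The overflow described above can be reproduced with a minimal sketch. It assumes a 32-bit `unsigned` (consistent with the value 4294967295 in the distributions below); the helper names `recordAsSigned`, `recordAsWide`, and `isFarther` are hypothetical and are not part of clangd or of the dataset tooling.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Sentinel for an unreachable distance, as described in the commit message.
// Assumption: `unsigned` is 32 bits wide, so the sentinel is 4294967295.
constexpr std::uint32_t kUnreachable =
    std::numeric_limits<std::uint32_t>::max();

// Hypothetical helper: mimics a dataset column typed as a signed 32-bit int.
// The conversion wraps modulo 2^32 on two's-complement targets
// (guaranteed by the language since C++20).
inline std::int32_t recordAsSigned(std::uint32_t distance) {
  return static_cast<std::int32_t>(distance);
}

// Hypothetical helper: mimics a column wide enough to hold the sentinel.
inline std::int64_t recordAsWide(std::uint32_t distance) {
  return static_cast<std::int64_t>(distance);
}

// Hypothetical threshold test of the kind a decision tree node performs.
inline bool isFarther(std::int64_t recorded, std::int64_t threshold) {
  return recorded >= threshold;
}
```

With the signed column, the sentinel becomes `-1` and compares as closer than every real distance, so a threshold test such as `isFarther(recordAsSigned(kUnreachable), 10)` takes the wrong branch. With the wide column the same test treats the unreachable case as the farthest possible distance.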

A new dataset was generated and a new model was trained on it, so that the
model interprets the unreachable distance as the intended value.

Distribution of `SymbolScopeDistance`:
Value         Normalised Frequency
0             46.6184
4294967295    29.5342
6             14.5666
4              6.4433
2              1.4534
8              0.5760
10             0.3581
....

Distribution of `FileProximityDistance`:
Value         Normalised Frequency
4294967295    39.9378
12             5.1997
14             4.9828
15             4.4221
16             4.3820
13             4.2765
17             3.8957
11             3.6387
19             3.4799
18             3.4076
....

Differential Revision: https://reviews.llvm.org/D89035
2020-10-08 15:30:00 +02:00
features.json [clangd] Add a trained DecisionForest for code completion. 2020-09-28 18:35:10 +02:00
forest.json [clangd] New CC Ranking Model to fix bad inference due to overflow. 2020-10-08 15:30:00 +02:00