Commit Graph

10 Commits

Author SHA1 Message Date
Sam McCall 9c5ebf7039 [include-fixer] Add fuzzy SymbolIndex, where identifier needn't match exactly.
Summary:
Add fuzzy SymbolIndex, where identifier needn't match exactly.

The purpose for this is global autocomplete in clangd. The query will be a
partial identifier up to the cursor, and the results will be suggestions.

It's in include-fixer because:

  - it handles SymbolInfos, actually SymbolIndex is exactly the right interface
  - it's a good harness for lit testing the fuzzy YAML index
  - (Laziness: we can't unit test clangd until reorganizing with a tool/ dir)

Other questionable choices:

  - FuzzySymbolIndex, which just refines the contract of SymbolIndex. This is
    an interface to allow extension to large monorepos (*cough*)
  - an always-true safety check that Identifier == Name is removed from
    SymbolIndexManager, as it's not true for fuzzy matching
  - exposing -db=fuzzyYaml from include-fixer is not a very useful feature, and
    a non-orthogonal ui (fuzziness vs data source). -db=fixed is similar though.

Reviewers: bkramer

Subscribers: cfe-commits, mgorny

Differential Revision: https://reviews.llvm.org/D30720

llvm-svn: 297630
2017-03-13 15:55:59 +00:00
Sam McCall 573050e703 [include-fixer] Remove line number from Symbol identity
Summary:
Remove line number from Symbol identity.

For our purposes (include-fixer and clangd autocomplete), function overloads
within the same header should mostly be treated as a single combined symbol.

We may want to track individual occurrences (line number, full type info)
and aggregate this during mapreduce, but that's not done here.

Reviewers: hokein, bkramer

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D30685

llvm-svn: 297371
2017-03-09 10:47:44 +00:00
Sam McCall b27dc2245f [include-fixer] Add usage count to find-all-symbols.
Summary:
Add usage count to find-all-symbols.

FindAllSymbols now finds (most!) main-file usages of the discovered symbols.
The per-TU map output has NumUses=0 or 1 (only one use per file is counted).
The reducer aggregates these to find the number of files that use a symbol.

The NumOccurrences is now set to 1 in the mapper rather than being inferred by
the reducer, for consistency.

The idea here is to use NumUses for ranking: intuitively number of files that
use a symbol is more meaningful than number of files that include the header.

Reviewers: hokein, bkramer

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D30210

llvm-svn: 296446
2017-02-28 08:13:15 +00:00
Manuel Klimek a47515ec4a Improve include fixer's ranking by taking the paths into account.
Instead of just using popularity, we also take into account how similar the
path of the current file is to the path of the header.
Our first approach is to get popularity into a reasonably small scale by taking
log2 (which is roughly intuitive to how humans would bucket popularity), and
multiply that with the number of matching prefix path fragments of the included
header with the current file.
Note that currently we do not take special care for unclean paths containing
"../" or "./".

Differential Revision: https://reviews.llvm.org/D28548

llvm-svn: 291664
2017-01-11 10:32:47 +00:00
Benjamin Kramer b53452b2b1 [include-fixer] Be smarter about inserting symbols for a prefix.
If prefix search finds something where nothing can be nested under (e.g.
a variable or macro) don't add it to the result.

This is for cases like:
header.h:
  extern int a;

file.cc:
namespace a {
  SOME_MACRO
}

We will look up a::SOME_MACRO, which doesn't have any results. Then we
look up 'a' and find something before we ever look up just 'SOME_MACRO'.
With some basic filtering we can avoid this case.

Differential Revision: http://reviews.llvm.org/D20960

llvm-svn: 271671
2016-06-03 14:07:38 +00:00
Benjamin Kramer 658d28014b [include-fixer] Rank symbols based on the number of occurrences we found while merging.
This sorts based on the popularity of the header, not the symbol. If
there are mutliple matching symbols in one header we take the maximum
popularity for that header and deduplicate. If we know nothing we sort
lexicographically based on the header path.

Differential Revision: http://reviews.llvm.org/D20814

llvm-svn: 271283
2016-05-31 14:33:28 +00:00
Benjamin Kramer 03016b85b6 [find-all-symbols] Add a test to make sure merging actually works.
llvm-svn: 271270
2016-05-31 12:12:19 +00:00
Eric Liu c893070ff1 [include-fixer] collect the number of times a symbols is found in an indexing run and use it for symbols popularity ranking.
Summary:
[include-fixer] collect the number of times a symbols is found in an
indexing run and use it for symbols popularity ranking.

Reviewers: bkramer

Subscribers: cfe-commits, hokein, djasper

Differential Revision: http://reviews.llvm.org/D20804

llvm-svn: 271268
2016-05-31 12:01:48 +00:00
Haojian Wu 2d07ed4530 [include-fixer] Add lit-test for relative include path.
Reviewers: bkramer

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D20159

llvm-svn: 269177
2016-05-11 12:30:45 +00:00
Haojian Wu d8c12badad [include-fixer] Add Yaml database integration.
Reviewers: bkramer

Subscribers: cfe-commits, klimek, djasper

Differential Revision: http://reviews.llvm.org/D19648

llvm-svn: 268017
2016-04-29 09:23:38 +00:00