Commit Graph

383 Commits

Author SHA1 Message Date
Ilya Biryukov 1c48f0383c [clangd] Make background index less chatty
Summary:
It is producing too much output in non-verbose mode,
i.e. a message per indexed file.

Reviewers: sammccall, kadircet

Reviewed By: sammccall

Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D56915

llvm-svn: 351563
2019-01-18 17:04:26 +00:00
Kadir Cetinkaya 226af75a02 [clangd] Fix updated file detection logic in indexing
Summary:
Files without any symbols were never marked as updated during indexing, which resulted in failure while writing shards for these files.

This patch fixes the logic to mark files that are seen for the first time but don't contain any symbols as updated.

Reviewers: ilya-biryukov

Reviewed By: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D56592

llvm-svn: 351170
2019-01-15 09:03:33 +00:00
Haojian Wu c34f022bfe [clangd] Add Limit parameter for xref.
Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D56597

llvm-svn: 351081
2019-01-14 18:11:09 +00:00
Kadir Cetinkaya 560b853ccf [clangd] Fix a reference invalidation
Summary: Fix for the breakage in http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/52811/consoleFull#-42777206a1ca8a51-895e-46c6-af87-ce24fa4cd561

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D56656

llvm-svn: 351052
2019-01-14 11:24:07 +00:00
Sam McCall 0e93b076c4 [clangd] Index main-file symbols (bug 39761)
Patch by Nathan Ridge!

Differential Revision: https://reviews.llvm.org/D55185

llvm-svn: 351041
2019-01-14 10:01:17 +00:00
Kadir Cetinkaya 99b060e447 [clangd] Introduce loading of shards within auto-index
Summary:
Whenever a change happens on a CDB, load shards associated with that
CDB before issuing re-index actions.

Reviewers: ilya-biryukov

Reviewed By: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D55224

llvm-svn: 350847
2019-01-10 17:03:04 +00:00
Haojian Wu 8f85b9f867 [clangd] Don't store completion info if the symbol is not used for code completion.
Summary:
This would save us some memory and disk space:
  - Dex usage (261 MB vs 266 MB)
  - Disk (75 MB vs 76 MB)

It would save more once we index main-file symbols (D55185).

Reviewers: ilya-biryukov

Reviewed By: ilya-biryukov

Subscribers: nridge, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D56314

llvm-svn: 350803
2019-01-10 09:22:40 +00:00
Haojian Wu 073d184ee3 [clangd] Fix a crash when reading an empty index file.
Summary:
Unfortunately, yaml::Input::setCurrentDocument() and yaml::Input::nextDocument() are
internal APIs, and the way we use them may cause a null pointer access when
processing an empty YAML file.

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D56442

llvm-svn: 350633
2019-01-08 15:24:47 +00:00
Ilya Biryukov f2001aa743 [clangd] Remove 'using namespace llvm' from .cpp files. NFC
The new guideline is to qualify with 'llvm::' explicitly both in
'.h' and '.cpp' files. This simplifies moving the code between
header and source files and is easier to keep consistent.

llvm-svn: 350531
2019-01-07 15:45:19 +00:00
Haojian Wu b2d7e269d5 [clangd] Don't miss the expected type in merge.
Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D55918

llvm-svn: 349750
2018-12-20 13:05:46 +00:00
Kadir Cetinkaya dd67793c0c [clangd] Unify path canonicalizations in the codebase
Summary:
There were a few different places where we canonicalized paths, and each
one had its own flavor. This patch tries to unify them all in one place.

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D55818

llvm-svn: 349618
2018-12-19 10:46:21 +00:00
Eric Liu 667e8ef7e1 [clangd] BackgroundIndex rebuilds symbol index periodically.
Summary:
Currently, background index rebuilds symbol index on every indexed file,
which can be inefficient. This patch makes it only rebuild symbol index periodically.
As the rebuild no longer happens too often, we could also build a more efficient
Dex index.
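
A minimal sketch of the "rebuild periodically" idea, assuming a time-based period. The class `PeriodicRebuilder` and its methods are hypothetical names for illustration and are not taken from the patch; the caller passes the current time in, so the logic can be exercised without sleeping.

```cpp
#include <chrono>

// Hypothetical helper: remembers that new data arrived and triggers a rebuild
// only when at least Period has elapsed since the previous rebuild.
class PeriodicRebuilder {
public:
  using Clock = std::chrono::steady_clock;

  PeriodicRebuilder(Clock::duration Period, Clock::time_point Start)
      : Period(Period), LastRebuild(Start) {}

  // Called after each indexed file. Returns true if a rebuild ran.
  bool fileIndexed(Clock::time_point Now) {
    Dirty = true;
    return maybeRebuild(Now);
  }

  // Rebuilds only if there is new data and the period has elapsed.
  bool maybeRebuild(Clock::time_point Now) {
    if (!Dirty || Now - LastRebuild < Period)
      return false;
    ++Rebuilds; // A real index would be rebuilt here.
    Dirty = false;
    LastRebuild = Now;
    return true;
  }

  unsigned rebuilds() const { return Rebuilds; }

private:
  Clock::duration Period;
  Clock::time_point LastRebuild;
  bool Dirty = false;
  unsigned Rebuilds = 0;
};
```

With a five-second period, files indexed at one and two seconds do not trigger a rebuild, while a file indexed at six seconds does.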

Reviewers: ilya-biryukov, kadircet

Reviewed By: kadircet

Subscribers: dblaikie, MaskRay, jkorous, arphaman, jfb, cfe-commits

Differential Revision: https://reviews.llvm.org/D55770

llvm-svn: 349496
2018-12-18 15:39:33 +00:00
Kadir Cetinkaya e913b956aa [clangd] Change diskbackedstorage to be atomic
Summary:
There was a chance that multiple clangd instances could try to write the
same shard, in which case we would most likely get a malformed file. This patch
changes the writing mechanism to first write to a temporary file and then rename
it to the real destination, which is guaranteed to be atomic by POSIX.
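
The write-then-rename pattern described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code from the patch: `writeFileAtomically` and `readWholeFile` are hypothetical helpers, the random temporary suffix is an assumption, and the atomic-replace guarantee applies to POSIX `rename()`.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <random>
#include <string>

// Hypothetical helper, not clangd's actual code: write Contents to a temporary
// sibling file, then rename it over Path so readers never see a partial file.
bool writeFileAtomically(const std::string &Path, const std::string &Contents) {
  assert(!Path.empty() && "destination path must not be empty");
  // Assumption: a random suffix is unique enough to keep concurrent writers
  // from sharing one temporary file.
  const std::string TmpPath =
      Path + ".tmp." + std::to_string(std::random_device{}());
  {
    std::ofstream Out(TmpPath, std::ios::binary | std::ios::trunc);
    Out << Contents;
    Out.flush();
    if (!Out) {
      std::remove(TmpPath.c_str());
      return false;
    }
  } // The stream is closed here, before the rename.
  // On POSIX systems rename() atomically replaces an existing destination.
  if (std::rename(TmpPath.c_str(), Path.c_str()) != 0) {
    std::remove(TmpPath.c_str());
    return false;
  }
  return true;
}

// Reads a whole file back; used only to check the result.
std::string readWholeFile(const std::string &Path) {
  std::ifstream In(Path, std::ios::binary);
  return std::string(std::istreambuf_iterator<char>(In),
                     std::istreambuf_iterator<char>());
}
```

Two writers racing on the same destination each produce a complete file under their own temporary name, and whichever rename lands last wins. The sketch does not address durability (no fsync) or platforms where rename does not replace an existing file.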

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, jfb, cfe-commits

Differential Revision: https://reviews.llvm.org/D55417

llvm-svn: 349348
2018-12-17 12:38:22 +00:00
Kadir Cetinkaya 375c54fd1e [clangd] Only reduce priority of a thread for indexing.
Summary:
We'll soon have tasks pending for reading shards from disk, and we want
them to have normal priority because:
- They are not CPU intensive, mostly IO bound.
- They give good coverage of the project at startup, so they are worth
  spending some cycles on.
- We have only one task per whole CDB rather than one task per file.

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, jfb, cfe-commits

Differential Revision: https://reviews.llvm.org/D55315

llvm-svn: 349345
2018-12-17 12:30:27 +00:00
Kadir Cetinkaya 1b65b376ae [dexp] Change FuzzyFind to also print scope of symbols
Summary:
When there are multiple symbols in the result of a fuzzy find with the
same name, one has to perform an additional query to figure out which of those
symbols are coming from the "interesting" scope. This patch prints the scope in
fuzzy find results to get rid of the second query.

Reviewers: hokein

Subscribers: ilya-biryukov, ioeric, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D55705

llvm-svn: 349152
2018-12-14 14:17:18 +00:00
Haojian Wu d5a78e6e59 [clangd] Fix an assertion failure in background index.
Summary:
When indexing a file which contains an error that prevents compilation, we
trigger an assertion failure: the IndexFileIn data is not set, but we
access it in the background index.

Reviewers: kadircet

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D55650

llvm-svn: 349144
2018-12-14 12:39:08 +00:00
Haojian Wu 9d0d9f884c [clangd] Move the utility function to anonymous namespace, NFC.
llvm-svn: 349031
2018-12-13 13:07:29 +00:00
Kadir Cetinkaya 219c0fae5c [clangd] Partition include graph on auto-index.
Summary:
Partitions include graphs in auto-index so that each shard contains
only part of the include graph related to itself.

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D55062

llvm-svn: 348252
2018-12-04 11:31:57 +00:00
Haojian Wu 7800dbe157 [clangd] Fix a stale comment, NFC.
llvm-svn: 348133
2018-12-03 13:16:04 +00:00
Kadir Cetinkaya 5399552da1 [clangd] Populate include graph during static indexing action.
Summary:
This is the second part for introducing include hierarchy into index
files produced by clangd. You can see the base patch that introduces structures
and discusses the future of the patches in D54817

Reviewers: ilya-biryukov

Subscribers: mgorny, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D54999

llvm-svn: 348005
2018-11-30 16:59:00 +00:00
Jan Korous 6089b6192e [clangd][NFC] Move SymbolID to a separate file
Prerequisite for textDocument/SymbolInfo

Differential Revision: https://reviews.llvm.org/D54799

llvm-svn: 347674
2018-11-27 16:40:34 +00:00
Kadir Cetinkaya d08eab4281 [clangd] Put direct headers into srcs section.
Summary:
Currently, there's no way of knowing about header files
using compilation database, since it doesn't contain header files as entries.

Using this information, restoring from cache using compile commands becomes
possible instead of doing directory traversal. Also, we can issue indexing
actions for out-of-date headers even if source files depending on them haven't
changed.

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D54817

llvm-svn: 347669
2018-11-27 16:08:53 +00:00
Sam McCall 422c828dfc [clangd] Enable auto-index behind a flag.
Summary:
Ownership and configuration:
The auto-index (background index) is maintained by ClangdServer, like Dynamic.
(This means ClangdServer will be able to enqueue preamble indexing in future).
For now it's enabled by a simple boolean flag in ClangdServer::Options, but
we probably want to eventually allow injecting the storage strategy.

New 'sync' command:
In order to meaningfully test the integration (not just unit-test components)
we need a way for tests to ensure the asynchronous index reads/writes occur
before a certain point.
Because these tests and assertions are few, I think exposing an explicit "sync"
command for use in tests is simpler than allowing threading to be completely
disabled in the background index (as we do for TUScheduler).

Bugs:
I fixed a couple of trivial bugs I found while testing, but there's one I can't.
JSONCompilationDatabase::getAllFiles() may return relative paths, and currently
we trigger an assertion that assumes they are absolute.
There's no efficient way to resolve them (you have to retrieve the corresponding
command and then resolve against its directory property). In general I think
this behavior is broken and we should fix it in JSONCompilationDatabase and
require CompilationDatabase::getAllFiles() to be absolute.

Reviewers: kadircet

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D54894

llvm-svn: 347567
2018-11-26 16:00:11 +00:00
Ilya Biryukov 4d3d82eef9 [clangd] Fix use-after-free with expected types in indexing
llvm-svn: 347563
2018-11-26 15:52:16 +00:00
Ilya Biryukov 647da3e8a5 [clangd] Add type boosting in code completion
Reviewers: sammccall, ioeric

Reviewed By: sammccall

Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52276

llvm-svn: 347562
2018-11-26 15:38:01 +00:00
Ilya Biryukov a21392bfc7 [clangd] Collect and store expected types in the index
Summary:
And add a hidden option to control whether the types are collected.
For experiments, will be removed when expected types implementation
is stabilized.

The index size is almost unchanged, e.g. the YAML index for all clangd
sources increased from 53MB to 54MB.

Reviewers: ioeric, sammccall

Reviewed By: sammccall

Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52274

llvm-svn: 347560
2018-11-26 15:29:14 +00:00
Sam McCall 7d0e4848ad [clangd] Fix missing include from r347538 - fix windows buildbots
llvm-svn: 347554
2018-11-26 13:35:02 +00:00
Sam McCall 6e2d2a33b6 [clangd] Auto-index watches global CDB for changes.
Summary:
Instead of receiving compilation commands, auto-index is triggered by just
filenames to reindex, and gets commands from the global comp DB internally.
This has advantages:
 - more of the work can be done asynchronously (fetching compilation commands
   upfront can be slow for large CDBs)
 - we get access to the CDB which can be used to retrieve interpolated commands
   for headers (useful in some cases where the original TU goes away)
 - fits nicely with the filename-only change observation from r347297

The interface to GlobalCompilationDatabase gets extended: when retrieving a
compile command, the GCDB can optionally report the project the file belongs to.
This naturally fits together with getCompileCommand: it's hard to implement one
without the other. But because most callers don't care, I've ended up with an
awkward optional-out-param-in-virtual method pattern - maybe there's a better
one.

This is the main missing integration point between ClangdServer and
BackgroundIndex, after this we should be able to add an auto-index flag.

Reviewers: ioeric, kadircet

Subscribers: MaskRay, jkorous, arphaman, cfe-commits, ilya-biryukov

Differential Revision: https://reviews.llvm.org/D54865

llvm-svn: 347538
2018-11-26 09:51:50 +00:00
Eric Liu c0ac4bb17c [clangd] Cleanup: stop passing around list of supported URI schemes.
Summary:
Instead of passing around a list of supported URI schemes in clangd, we
expose an interface to convert a path to URI using any compatible scheme
that has been registered. It favors customized schemes and falls
back to "file" when no other scheme works.

Changes in this patch are:
- URI::create(AbsPath, URISchemes) -> URI::create(AbsPath). The new API finds a
compatible scheme from the registry.
- Remove URISchemes option everywhere (ClangdServer, SymbolCollector, FileIndex etc).
- Unit tests will use "unittest" by default.
- Move "test" scheme from ClangdLSPServer to ClangdMain.cpp, and only
register the test scheme when lit-test or enable-lit-scheme is set.
(The new flag is added to make lit protocol.test work; I wonder if there
is alternative here.)

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D54800

llvm-svn: 347467
2018-11-22 15:02:05 +00:00
Kadir Cetinkaya dd91a36422 Address comments.
llvm-svn: 347237
2018-11-19 18:06:36 +00:00
Kadir Cetinkaya 244ac0dba0 Use digest size instead of hardcoding it.
llvm-svn: 347236
2018-11-19 18:06:33 +00:00
Kadir Cetinkaya ca9e5dc714 [clangd] Store source file hash in IndexFile{In,Out}
Summary:
Puts the digest of the source file that generated the index into
serialized index and stores them back on load, if exists.

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D54693

llvm-svn: 347235
2018-11-19 18:06:29 +00:00
Haojian Wu 22c9f7b296 [clangd] Truncate SymbolID to 8 bytes.
Summary:
This is our goal. It has a non-zero risk, but so far we haven't seen any
collisions (externally or internally).

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D54622

llvm-svn: 347044
2018-11-16 10:58:40 +00:00
Haojian Wu 1bf52c59b7 [clangd] Fix a compiler warning and test crashes caused in rL347038.
llvm-svn: 347039
2018-11-16 09:41:14 +00:00
Kadir Cetinkaya 06553bfe96 Introduce shard storage to auto-index.
Reviewers: sammccall, ioeric

Reviewed By: sammccall

Subscribers: llvm-commits, mgorny, Eugene.Zelenko, ilya-biryukov, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D54269

llvm-svn: 347038
2018-11-16 09:03:56 +00:00
Haojian Wu fd4d45514f [clangd] global-symbol-builder => clangd-indexer
llvm-svn: 346955
2018-11-15 14:15:19 +00:00
Haojian Wu 5e7486f518 [clangd] Fix no results returned for global symbols in dexp
Summary:
For symbols in global namespace (without any scope), we need to
add global scope "" to the fuzzy request.

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D54519

llvm-svn: 346947
2018-11-15 12:17:41 +00:00
Kadir Cetinkaya 5a9b92ca75 Revert "Introduce shard storage to auto-index."
This reverts commit 6dd1f24aead10a8d375d0311001987198d26e900.

llvm-svn: 346945
2018-11-15 10:34:47 +00:00
Kadir Cetinkaya bd2441c887 Revert "clang-format"
This reverts commit 0a37e9c3d88a2e21863657df2f7735fb7e5f746e.

llvm-svn: 346944
2018-11-15 10:34:43 +00:00
Kadir Cetinkaya ed18e788f0 Revert "Address comments"
This reverts commit 19a39b14eab2b5339325e276262b177357d6b412.

llvm-svn: 346943
2018-11-15 10:34:39 +00:00
Kadir Cetinkaya 8b9fed3e8d Revert "Address comments."
This reverts commit b43c4d1c731e07172a382567f3146b3c461c5b69.

llvm-svn: 346942
2018-11-15 10:34:35 +00:00
Kadir Cetinkaya 2bed2cf791 Address comments.
llvm-svn: 346941
2018-11-15 10:31:23 +00:00
Kadir Cetinkaya 89a7691fd9 Address comments
llvm-svn: 346940
2018-11-15 10:31:19 +00:00
Kadir Cetinkaya cb8407ca89 clang-format
llvm-svn: 346939
2018-11-15 10:31:15 +00:00
Kadir Cetinkaya 3e5a47560c Introduce shard storage to auto-index.
Reviewers: sammccall, ioeric

Subscribers: ilya-biryukov, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D54269

llvm-svn: 346938
2018-11-15 10:31:10 +00:00
Haojian Wu ee54a2b501 [clangd] Replace StringRef in SymbolLocation with a char pointer.
Summary:
This would save us 8 bytes per ref, and buy us ~40MB in total
for llvm index (from ~300MB to ~260 MB).

The char pointer must be null-terminated, and llvm::StringSaver
guarantees it.
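
To illustrate where the 8 bytes per ref come from, here is a small sketch that uses `std::string_view` as a stand-in for `llvm::StringRef` (both hold a pointer plus a length). The struct names and the `fileURI` helper are hypothetical, and the exact sizes depend on the platform.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Stand-in for a location that stores its file URI as a (pointer, length) view.
struct LocationWithView {
  std::string_view FileURI;
  std::uint32_t Start = 0;
  std::uint32_t End = 0;
};

// Same data, but the URI is a bare pointer to a null-terminated string.
struct LocationWithPointer {
  const char *FileURI = "";
  std::uint32_t Start = 0;
  std::uint32_t End = 0;
};

// The length is no longer stored, so it is recomputed from the terminator.
// This is why the pointed-to string must be null-terminated.
inline std::string_view fileURI(const LocationWithPointer &Loc) {
  assert(Loc.FileURI && "FileURI must point at a null-terminated string");
  return std::string_view(Loc.FileURI);
}
```

On a typical 64-bit target the view-based struct carries an extra 8-byte length field, which the pointer-based struct drops at the cost of a length scan on access.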

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53427

llvm-svn: 346852
2018-11-14 11:55:45 +00:00
Haojian Wu 172c045590 [clangd] Don't show all refs results if -name is ambiguous in dexp.
Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D54430

llvm-svn: 346671
2018-11-12 16:41:15 +00:00
Haojian Wu 62fb2a216e [clangd] Allow symbols from AnyScope in dexp.
Summary:
We should allow symbols from any scope in dexp results, otherwise
`find StringRef` doesn't return any results (llvm::StringRef).

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D54427

llvm-svn: 346666
2018-11-12 16:03:59 +00:00
Eric Liu 961024f174 [clangd] Remember to serialize AnyScope in FuzzyFindRequest json.
llvm-svn: 346648
2018-11-12 12:24:08 +00:00
Haojian Wu f761a2c620 [clangd] Drop namespace references in the index.
Summary:
Namespace references are less useful than references to other symbols, and
they contribute a large part of the index. This patch drops them.
The number of refs is reduced from 5.4 million to 4.7 million.

|           | Before | After  |
| file size | 78 MB  | 71 MB  |
| memory    | 330 MB | 300 MB |

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D54202

llvm-svn: 346319
2018-11-07 14:59:24 +00:00
Kadir Cetinkaya f84a7d8d4f [clangd] [NFC] Fix clang-tidy warnings.
Reviewers: ioeric, sammccall, ilya-biryukov, hokein

Subscribers: MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D54157

llvm-svn: 346308
2018-11-07 12:25:27 +00:00
Eric Liu b04869a4aa [clangd] Get rid of QueryScopes.empty() == AnyScope special case.
Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53933

llvm-svn: 346223
2018-11-06 11:08:17 +00:00
Eric Liu ad588af2d6 [clangd] auto-index stores symbols per-file instead of per-TU.
Summary:
This allows us to deduplicate header symbols across TUs. File digests
are collected when collecting symbols/refs, and the index store deduplicates
file symbols based on the file digest.
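
A rough sketch of digest-based deduplication, assuming symbols are plain strings and the digest is an opaque string. `FileShard` and `PerFileStore` are hypothetical names and not the types used by the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-file record: the digest of the file contents plus the
// symbols that were collected from that file.
struct FileShard {
  std::string Digest;
  std::vector<std::string> Symbols;
};

// Hypothetical store keyed by file path rather than by translation unit.
class PerFileStore {
public:
  // Returns true if the symbols were stored, false if the same file with the
  // same digest was already present (e.g. a header seen from another TU).
  bool update(const std::string &Path, const std::string &Digest,
              std::vector<std::string> Symbols) {
    assert(!Digest.empty() && "every file is expected to carry a digest");
    auto It = Shards.find(Path);
    if (It != Shards.end() && It->second.Digest == Digest)
      return false;
    Shards[Path] = FileShard{Digest, std::move(Symbols)};
    return true;
  }

  std::size_t size() const { return Shards.size(); }

private:
  std::map<std::string, FileShard> Shards;
};
```

A header included by two TUs is stored once as long as its digest is unchanged; a new digest replaces the stored entry.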

Reviewers: sammccall, hokein

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53433

llvm-svn: 346221
2018-11-06 10:55:21 +00:00
Kadir Cetinkaya 6675be8747 [clangd] Use thread pool for background indexing.
Reviewers: sammccall, ioeric

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, jfb, cfe-commits

Differential Revision: https://reviews.llvm.org/D53651

llvm-svn: 345590
2018-10-30 12:13:27 +00:00
Kadir Cetinkaya b915790385 [clangd] Do not query index for new name completions.
Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D53192

llvm-svn: 345153
2018-10-24 15:24:29 +00:00
Haojian Wu 40d5684d41 [clangd] Hide position line and column fields.
Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53577

llvm-svn: 345134
2018-10-24 12:56:41 +00:00
Sam McCall 668ac94ba4 [clangd] Truncate SymbolID to 16 bytes.
Summary:
The goal is 8 bytes, which has a nonzero risk of collisions with huge indexes.
This patch should shake out any issues with truncation at all, we can lower
further later.

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53587

llvm-svn: 345113
2018-10-24 06:58:42 +00:00
Eric Liu 0b70a87480 [clangd] Support URISchemes configuration in BackgroundIndex.
Reviewers: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53503

llvm-svn: 344912
2018-10-22 15:37:58 +00:00
Sam McCall 45b2754097 [clangd] Fix unqualified make_unique after r344850. NFC
llvm-svn: 344858
2018-10-20 17:40:12 +00:00
Sam McCall c008af6466 [clangd] Namespace style cleanup in cpp files. NFC.
Standardize on the most common namespace setup in our *.cpp files:
  using namespace llvm;
  namespace clang {
  namespace clangd {
  void foo(StringRef) { ... }
And remove redundant llvm:: qualifiers. (Except for cases like
make_unique where this causes problems with std:: and ADL).

This choice is pretty arbitrary, but some broad consistency is nice.
This is going to conflict with everything. Sorry :-/

Squash the other configurations:

A)
  using namespace llvm;
  using namespace clang;
  using namespace clangd;
  void clangd::foo(StringRef);
This is in some of the older files. (It prevents accidentally defining a
new function instead of one in the header file, for what that's worth).

B)
  namespace clang {
  namespace clangd {
  void foo(llvm::StringRef) { ... }
This is fine, but in practice the using directive often gets added over time.

C)
  namespace clang {
  namespace clangd {
  using namespace llvm; // inside the namespace
This was pretty common, but is a bit misleading: name lookup prefers
clang::clangd::foo > clang::foo > llvm::foo (no matter where the using
directive is).
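
The lookup order in configuration C can be checked with a small standalone example. The functions `bar`, `callFoo`, and `callBar` are made up for illustration, and the `llvm` and `clang` namespaces here are local stand-ins rather than the real libraries.

```cpp
#include <string>

namespace llvm {
inline std::string foo() { return "llvm::foo"; }
inline std::string bar() { return "llvm::bar"; }
} // namespace llvm

namespace clang {
inline std::string foo() { return "clang::foo"; }

namespace clangd {
using namespace llvm; // inside the namespace, as in configuration C

// Unqualified lookup finds clang::foo in the enclosing namespace first.
// Names brought in by the using directive behave as if they were declared in
// the global namespace, so llvm::foo is hidden.
inline std::string callFoo() { return foo(); }

// Nothing in clang or clang::clangd is named bar, so llvm::bar is found.
inline std::string callBar() { return bar(); }
} // namespace clangd
} // namespace clang
```

Here `callFoo()` resolves to `clang::foo` even though the using directive sits in the innermost namespace, which is the misleading part.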

llvm-svn: 344850
2018-10-20 15:30:37 +00:00
Simon Pilgrim ad28838111 Fix MSVC "not all control paths return a value" warning. NFCI.
llvm-svn: 344844
2018-10-20 13:18:49 +00:00
Haojian Wu 812b6c51c3 [clangd] Remove the overflow log.
Summary:
LLVM codebase has generated files (all are build/Target/XXX/*.inc) that
exceed the MaxLine & MaxColumn. Printing these logs would be noisy.

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53400

llvm-svn: 344777
2018-10-19 08:35:24 +00:00
Krasimir Georgiev 9035420091 [clangd] Fix msan failure after r344735 by initializing bitfields
That revision changed integer members to bitfields; the integers were
default initialized before and the bitfields lost that default
initialization. This started causing msan use-of-uninitialized memory in
clangd tests.
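
A minimal sketch of the kind of fix described, assuming a position packed into two bitfields. The struct name and the 20/12 bit split are illustrative assumptions. Before C++20, bitfields cannot have default member initializers, so the constructor initializes them explicitly.

```cpp
#include <cstdint>

// Hypothetical packed position; the field widths are assumed for illustration.
struct PackedPosition {
  // Without this constructor, a default-initialized PackedPosition would hold
  // indeterminate bits, which is what MemorySanitizer reports on use.
  PackedPosition() : Line(0), Column(0) {}

  std::uint32_t Line : 20;
  std::uint32_t Column : 12;
};
```

With the constructor in place, `PackedPosition P;` always starts at line 0, column 0.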

llvm-svn: 344773
2018-10-19 06:05:32 +00:00
Haojian Wu 6ece6e7dad [clangd] Clear the semantic of RefSlab::size.
Summary:
RefSlab::size can easily cause confusion: it returns the number of
different symbols, rather than the number of all references.

- add a numRefs() method and cache the count, since calculating it every time is nontrivial.
- fix the places that misused size().

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53389

llvm-svn: 344745
2018-10-18 15:33:20 +00:00
Eric Liu 4859738cfe [clangd] Names that are not spelled in source code are reserved.
Summary:
These are often not expected to be used directly e.g.
```
TEST_F(Fixture, X) {
  ^  // "Fixture_X_Test" expanded in the macro should be down ranked.
}
```

Only doing this for sema for now, as such symbols are mostly coming from sema
e.g. gtest macros expanded in the main file. We could also add a similar field
for the index symbol.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53374

llvm-svn: 344736
2018-10-18 12:23:05 +00:00
Haojian Wu b515fabb3b [clangd] Encode Line/Column as a 32-bits integer.
Summary:
This would save us memory. A 32-bit integer is enough for
most human-readable source code (up to 4M lines and 4K columns).

Previously, we used 8 bytes for a position, now 4 bytes; this saves
us 8 bytes for each Ref and each Symbol instance.

For LLVM-project binary index file, we save ~13% memory.

| Before | After |
| 412MB  | 355MB |

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53363

llvm-svn: 344735
2018-10-18 10:43:50 +00:00
Haojian Wu c014d863be [clangd] Fix buildbot failure.
llvm-svn: 344680
2018-10-17 08:54:48 +00:00
Haojian Wu 0404855529 [clangd] Print numbers of symbols and refs as well when loading the index.

llvm-svn: 344679
2018-10-17 08:48:04 +00:00
Haojian Wu 7dd4950ea5 [clangd] Collect refs from headers.
Summary:
Add a flag to SymbolCollector to collect refs from headers.

Note that we collect refs from headers in static index, and we don't do it for
dynamic index because of the preamble (we skip function bodies in the preamble,
so collecting refs there would give incomplete results).

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53322

llvm-svn: 344678
2018-10-17 08:38:36 +00:00
Sam McCall bca624ab03 [clangd] Fix threading bugs in (not-yet-used) BackgroundIndex, re-enable test.
Summary:
One relatively boring bug: forgot to notify the CV after enqueue.

One much more fun bug: the thread member could access instance variables before
they were initialized. Although the thread was last in the init list, QueueCV
etc were listed after Thread in the class, so their default constructors raced
with the thread itself.
We have to get very unlucky to lose this race, I saw it 0.02% of the time.
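
Both bugs can be illustrated with a small worker class. This is a sketch and not the BackgroundIndex code: the class `Worker` and its members are hypothetical, apart from the `QueueCV` and `Thread` names mentioned above. The key points are the member declaration order and the notify after enqueue.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

class Worker {
public:
  // Members are initialized in declaration order, not init-list order, so the
  // thread may only start once everything it touches has been constructed.
  Worker() : Thread([this] { run(); }) {}

  ~Worker() {
    {
      std::lock_guard<std::mutex> Lock(Mu);
      ShouldStop = true;
    }
    QueueCV.notify_all();
    Thread.join();
  }

  void enqueue(std::function<void()> Task) {
    {
      std::lock_guard<std::mutex> Lock(Mu);
      Queue.push_back(std::move(Task));
    }
    QueueCV.notify_one(); // Forgetting this leaves the worker asleep.
  }

private:
  void run() {
    for (;;) {
      std::function<void()> Task;
      {
        std::unique_lock<std::mutex> Lock(Mu);
        QueueCV.wait(Lock, [this] { return ShouldStop || !Queue.empty(); });
        if (Queue.empty())
          return; // Stop was requested and the queue is drained.
        Task = std::move(Queue.front());
        Queue.pop_front();
      }
      Task(); // Run the task outside the lock.
    }
  }

  std::mutex Mu;
  std::condition_variable QueueCV;
  std::deque<std::function<void()>> Queue;
  bool ShouldStop = false;
  std::thread Thread; // Declared last so it is constructed last.
};
```

Moving `Thread` above `QueueCV` in the member list would reintroduce the race, even if the constructor's init list still mentioned the thread last.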

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, jfb, cfe-commits

Differential Revision: https://reviews.llvm.org/D53313

llvm-svn: 344595
2018-10-16 09:05:13 +00:00
Sam McCall 96f2489557 [clangd] Optionally use dex for the preamble parts of the dynamic index.
Summary:
Reuse the old -use-dex-index experiment flag for this.

To avoid breaking the tests, make Dex deduplicate symbols, addressing an old FIXME.

Reviewers: hokein

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53288

llvm-svn: 344594
2018-10-16 08:53:52 +00:00
Sam McCall bc8aee15a2 [clangd] Revert include path change in Dexp. NFC
llvm-svn: 344533
2018-10-15 16:47:45 +00:00
Haojian Wu 397704ca40 [clangd] Add createIndex in dexp
Summary:
This would allow easily injecting our internal customization.

Also updates the stale "symbol-collection-file" flag.

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53292

llvm-svn: 344521
2018-10-15 15:12:40 +00:00
Sam McCall 2b24ce61a0 [clangd] Use SyncAPI in more places in tests. NFC
llvm-svn: 344520
2018-10-15 15:04:03 +00:00
Sam McCall 8dc9dbb61a [clangd] Minimal implementation of automatic static index (not enabled).
Summary:
See tinyurl.com/clangd-automatic-index for design and goals.

Lots of limitations to keep this patch smallish, TODOs everywhere:
 - no serialization to disk
 - no changes to dynamic index, which now has a much simpler job
 - no partitioning of symbols by file to avoid duplication of header symbols
 - no reindexing of edited files
 - only a single worker thread
 - compilation database is slurped synchronously (doesn't scale)
 - uses memindex, rebuilds after every file (should be dex, periodically)

It's not hooked up to ClangdServer/ClangdLSPServer yet: the layering
isn't clear (it should really be in ClangdServer, but ClangdLSPServer
has all the CDB interactions).

Reviewers: ioeric

Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, jfb, cfe-commits

Differential Revision: https://reviews.llvm.org/D53032

llvm-svn: 344513
2018-10-15 13:34:10 +00:00
Haojian Wu 82ba7121e8 [clangd] Remove an unused include header, NFC.
llvm-svn: 344510
2018-10-15 12:39:45 +00:00
Haojian Wu ddec850ceb [clangd] dump xrefs information in dexp tool.
Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53019

llvm-svn: 344508
2018-10-15 12:32:49 +00:00
Haojian Wu e83caccb58 [clangd] Fix some references missing in dynamic index.
Summary:
Previously, SymbolCollector post-filtered all references at the end to
find all references of interesting symbols.
This was incorrect when indexing the main AST, where we don't see locations
of symbol declarations and definitions (as those are in the
preamble AST).

The fix is to do the check early, while collecting references.

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53273

llvm-svn: 344507
2018-10-15 11:46:26 +00:00
Jonas Toth f79f8eecce [clangd] NFC fix semicolon warning
llvm-svn: 344384
2018-10-12 17:47:43 +00:00
Haojian Wu 292a36a0d5 [clangd] Fix an accident change in r342999.
llvm-svn: 344054
2018-10-09 15:16:14 +00:00
Jonas Toth 3acdd020b4 [clangd] fix miscompiling lower_bound call
llvm-svn: 344044
2018-10-09 13:24:50 +00:00
Kirill Bobyrev 4a5ff88fdb [clangd] NFC: Migrate to LLVM STLExtras API where possible
This patch improves readability by migrating `std::function(ForwardIt
start, ForwardIt end, ...)` to LLVM's STLExtras range-based equivalent
`llvm::function(RangeT &&Range, ...)`.
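
As an illustration of the migration, reading `function` above as a placeholder for an algorithm name such as `any_of`, here is a self-contained sketch. The `sketch::anyOf` wrapper is a local stand-in written for this example so that it compiles without LLVM headers; it is not the STLExtras implementation.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

namespace sketch {
// Local stand-in for a range-based wrapper in the style of STLExtras.
template <typename RangeT, typename UnaryPredicate>
bool anyOf(RangeT &&Range, UnaryPredicate P) {
  return std::any_of(std::begin(Range), std::end(Range), P);
}
} // namespace sketch

// Iterator-pair form, as used before the migration.
inline bool hasNegativeOld(const std::vector<int> &Values) {
  return std::any_of(Values.begin(), Values.end(),
                     [](int V) { return V < 0; });
}

// Range-based form, as used after the migration.
inline bool hasNegativeNew(const std::vector<int> &Values) {
  return sketch::anyOf(Values, [](int V) { return V < 0; });
}
```

Both functions compute the same result; the range-based form simply avoids repeating the container name.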

Similar change in Clang: D52576.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D52650

llvm-svn: 343937
2018-10-07 14:49:41 +00:00
Sam McCall 5fb9746c49 [clangd] Remove last usage of ast matchers from SymbolCollector. NFC
llvm-svn: 343849
2018-10-05 14:03:04 +00:00
Sam McCall 50b89f0a9b [clangd] Simplify Dex query tree logic and fix missing-posting-list bug
Summary:
The bug being fixed: when a posting list doesn't exist in the index, it
was previously just dropped from the query rather than being treated as
empty. Now that we have the FALSE iterator, we can use it instead.

The query tree logic previously had a bunch of special cases to detect whether
subtrees are empty. Now we just naively build the whole tree, and rely
on the query optimizations to drop the trivial parts.

Finally, there was a bug in trigram generation: the empty query would
generate a single trigram "$$$" instead of no trigrams.
This had no effect (there was no posting list, so the other bug
cancelled it out). But we now have to fix this bug too.

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52796

llvm-svn: 343802
2018-10-04 17:18:55 +00:00
Sam McCall aa728f1afa [clangd] Dex: FALSE iterator, peephole optimizations, fix AND bug
Summary:
The FALSE iterator will be used in a followup patch to fix a logic bug in Dex
(currently, tokens that don't have posting lists in the index are simply dropped
from the query, changing semantics).

It can usually be optimized away, so added the following optimizations:
 - simplify booleans inside AND/OR
 - replace effectively-empty AND/OR with booleans
 - flatten nested AND/ORs

While working on this, found a bug in the AND iterator: its constructor sync()
assumes that ReachedEnd is set if applicable, but the constructor never sets it.
This crashes if a non-first iterator is nonempty.
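
A rough sketch of the three peephole rules listed above, applied to AND only. The `Node` type and `makeAnd` helper are hypothetical names for illustration and are not clangd's iterator classes; children are assumed to have been built with the same helper, so nested ANDs are already simplified.

```cpp
#include <utility>
#include <vector>

// Hypothetical, simplified query node (not clangd's Iterator class).
struct Node {
  enum Kind { True, False, Leaf, And };
  Kind K;
  int LeafID;                 // Meaningful only for Leaf.
  std::vector<Node> Children; // Meaningful only for And.
};

// Builds an AND node, applying the peephole rules while constructing it.
Node makeAnd(std::vector<Node> Children) {
  std::vector<Node> Real;
  for (Node &C : Children) {
    switch (C.K) {
    case Node::False:
      // AND(x, FALSE) is FALSE: the whole subtree collapses.
      return Node{Node::False, 0, {}};
    case Node::True:
      // AND(x, TRUE) is x: drop the trivial child.
      break;
    case Node::And:
      // Flatten AND(AND(a, b), c) into AND(a, b, c).
      for (Node &Grandchild : C.Children)
        Real.push_back(std::move(Grandchild));
      break;
    case Node::Leaf:
      Real.push_back(std::move(C));
      break;
    }
  }
  if (Real.empty())
    return Node{Node::True, 0, {}}; // Effectively-empty AND matches everything.
  if (Real.size() == 1)
    return std::move(Real.front()); // AND(x) is just x.
  return Node{Node::And, 0, std::move(Real)};
}
```

OR would follow the same shape with the roles of TRUE and FALSE swapped.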

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52789

llvm-svn: 343801
2018-10-04 17:18:49 +00:00
Sam McCall 422f724618 [clangd] expose MergedIndex class
Summary:
This allows inheriting from it, so index() can go away, and allows
TestTU::index() to be fixed.

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52250

llvm-svn: 343780
2018-10-04 14:20:22 +00:00
Sam McCall cc21779c3c [clangd] clangd-indexer gathers refs and stores them in index files.
Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52531

llvm-svn: 343778
2018-10-04 14:09:55 +00:00
Sam McCall 2ec5a10db3 [clangd] Remove one-segment-skipping from Dex trigrams.
Summary:
Currently queries like "ab" can match identifiers like a_yellow_bee.
The value of allowing this for exactly one segment but no more seems dubious.
It costs ~3% of overall ram (~9% of posting list ram) and some quality.

Reviewers: ilya-biryukov, ioeric

Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52885

llvm-svn: 343777
2018-10-04 14:08:11 +00:00
Sam McCall b5bbfef6cd [clangd] Dex: fix/simplify short-trigram generation
Summary:
1) Instead of x$$ for a short-query trigram, just use x
2) Make rules more coherent: prefixes of length 1-2, and first char + next head
3) Fix Dex::fuzzyFind to mark results as incomplete, because
   short-trigram rules only yield a subset of results.

Reviewers: ioeric

Subscribers: ilya-biryukov, jkorous, mgrang, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52808

llvm-svn: 343775
2018-10-04 14:01:55 +00:00
Sam McCall 87f69eaf4e [clangd] Dex: FALSE iterator, peephole optimizations, fix AND bug
Summary:
The FALSE iterator will be used in a followup patch to fix a logic bug in Dex
(currently, tokens that don't have posting lists in the index are simply dropped
from the query, changing semantics).

It can usually be optimized away, so added the following optimizations:
 - simplify booleans inside AND/OR
 - replace effectively-empty AND/OR with booleans
 - flatten nested AND/ORs

While working on this, found a bug in the AND iterator: its constructor sync()
assumes that ReachedEnd is set if applicable, but the constructor never sets it.
This crashes if a non-first iterator is nonempty.

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52789

llvm-svn: 343774
2018-10-04 13:12:23 +00:00
Sam McCall d9eae39800 [clangd] Support refs() in dex. Largely cloned from MemIndex.
Reviewers: hokein

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52726

llvm-svn: 343760
2018-10-04 09:16:12 +00:00
Sam McCall 41e6d76c22 [clangd] clangd-indexer: Drop support for MR-via-YAML
Summary:
It's slow, and the open-source reduce implementation doesn't scale properly.
While here, tidy up some dead headers and comments.

Reviewers: kadircet

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D52517

llvm-svn: 343759
2018-10-04 08:30:03 +00:00
Sam McCall a659d779f8 Reland r343589 "[clangd] Dex: add Corpus factory for iterators, rename, fold constant. NFC"
This reverts commit r343610.

llvm-svn: 343622
2018-10-02 19:59:23 +00:00
Reid Kleckner 2b5259afb3 Revert r343589 "[clangd] Dex: add Corpus factory for iterators, rename, fold constant. NFC"
Declaring a field with the same name as a type causes GCC to error out:

Dex.h:104:10: error: declaration of 'clang::clangd::dex::Corpus clang::clangd::dex::Dex::Corpus' [-fpermissive]
   Corpus Corpus;
          ^
Iterator.h:127:7: error: changes meaning of 'Corpus' from 'class clang::clangd::dex::Corpus' [-fpermissive]
 class Corpus {
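
A small sketch of the pattern behind this error and one possible way around it. The member name `Docs` is hypothetical; how the relanded patch actually resolved the clash is not stated here.

```cpp
namespace dex {

class Corpus {
public:
  explicit Corpus(unsigned Size) : Size(Size) {}
  unsigned size() const { return Size; }

private:
  unsigned Size;
};

class Dex {
public:
  explicit Dex(unsigned NumDocs) : Docs(NumDocs) {}

  // Writing `Corpus Corpus;` here would make the name `Corpus` refer to the
  // member for the rest of the class, which is what GCC rejects in the log
  // above. A member name that differs from the type name avoids the clash.
  Corpus Docs; // Hypothetical name, chosen only for this sketch.
};

} // namespace dex
```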

llvm-svn: 343610
2018-10-02 17:31:43 +00:00
Sam McCall 51be55d0ec [clangd] Zap TODONEs
llvm-svn: 343590
2018-10-02 13:51:43 +00:00
Sam McCall a1e7385d5c [clangd] Dex: add Corpus factory for iterators, rename, fold constant. NFC
Summary:
- Corpus avoids having to pass size to the true iterator, and (soon) any
  iterator that might optimize down to true.
- Shorten names of factory functions now they're scoped to the Corpus.
  intersect() and unionOf() rather than createAnd() or createOr() as this
  seems to read better to me, and fits with other short names. Opinion wanted!
- DEFAULT_BOOST_SCORE --> 1. This is a multiplier, don't obfuscate identity.
- Simplify variadic templates in Iterator.h

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52711

llvm-svn: 343589
2018-10-02 13:44:26 +00:00
Sam McCall 7402836042 [clangd] Dex iterator printer shows query structure, not iterator state.
Summary:
This makes it suitable for logging (which immediately found a bug, to
be fixed in the next patch...)

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52715

llvm-svn: 343580
2018-10-02 11:51:36 +00:00
Sam McCall 329fc143fd [clangd] Query dex index using query-style trigrams, not identifier-style trigrams
llvm-svn: 343453
2018-10-01 10:42:51 +00:00
Eric Liu d5d6a60a78 [clangd] Fix header mapping for std::string. NFC
Some implementations declare std::string in <iosfwd>.

llvm-svn: 343448
2018-10-01 08:50:49 +00:00
Eric Liu 670c147d83 [clangd] Initial support for cross-namespace global code completion.
Summary:
When no scope qualifier is specified, allow completing index symbols
from any scope and insert the proper qualifier automatically. This is still
experimental and hidden behind a flag.

Things missing:
- Scope proximity based scoring.
- FuzzyFind supports weighted scopes.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: kbobyrev, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52364

llvm-svn: 343248
2018-09-27 18:46:00 +00:00
Eric Liu ee7fe93fa8 [clangd] Add more tracing to index queries. NFC
Reviewers: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52611

llvm-svn: 343247
2018-09-27 18:23:23 +00:00
Kirill Bobyrev ea4f20c6be [clangd] Fix bugs with incorrect memory estimate report
* With the current implementation, `sizeof(std::vector<Chunk>)` is added
twice to the `Dex` memory estimate which is incorrect
* `Dex` logs memory usage estimation before `BackingDataSize` is set and
hence the log report excludes size of the external `SymbolSlab` which is
coupled with `Dex` instance

Reviewed By: ioeric

Differential Revision: https://reviews.llvm.org/D52503

llvm-svn: 343117
2018-09-26 15:06:23 +00:00
Kirill Bobyrev 0cdf629394 [docs] Update PostingList string representation format
Because `PostingList` objects are compressed, it is now impossible to
see elements other than the current one and the documentation doesn't
match implementation anymore.

Reviewed By: ioeric

Differential Revision: https://reviews.llvm.org/D52545

llvm-svn: 343116
2018-09-26 14:59:49 +00:00
Simon Pilgrim 3462e76ba5 Removed extra semicolon to fix Wpedantic. (NFCI).
llvm-svn: 343083
2018-09-26 09:02:45 +00:00
Sam McCall 321d5d4802 [clangd] Extract mapper logic from clangd-indexer into a library.
Summary: Soon we can drop support for MR-via-YAML.
I need to modify some out-of-tree versions to use the library, first.

Reviewers: kadircet

Subscribers: mgorny, ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D52465

llvm-svn: 343019
2018-09-25 20:02:36 +00:00
Sam McCall e38c7f8d96 [clangd] Fix reversed RIFF/YAML serialization
llvm-svn: 343017
2018-09-25 19:53:33 +00:00
Sam McCall 02d600d267 [clangd] Merge binary + YAML serialization behind a (mostly) common interface.
Summary:
Interface is in one file, implementation in two as they have little in common.
A couple of ad-hoc YAML functions left exposed:
 - symbol -> YAML I expect to keep for tools like dexp
 - YAML -> symbol is used for the MR-style indexer, I think we can eliminate
   this (merge-on-the-fly, else use a different serialization)

Reviewers: kbobyrev

Subscribers: mgorny, ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52453

llvm-svn: 342999
2018-09-25 18:06:43 +00:00
Kirill Bobyrev d041f8a9d0 [clangd] NFC: Simplify code, enforce LLVM Coding Standards
For consistency, functional-style code pieces are replaced with their
simple counterparts to improve readability.

Also, file headers are fixed to comply with LLVM Coding Standards.

`static` member of anonymous namespace is not marked `static` anymore,
because it is redundant.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D52466

llvm-svn: 342974
2018-09-25 13:58:48 +00:00
Kirill Bobyrev 69e6388564 [clangd] Fix some buildbots after r342965
Some compilers fail to parse struct default member initializer.

llvm-svn: 342970
2018-09-25 13:14:11 +00:00
Kirill Bobyrev 6c2f5bd0f1 [clangd] Implement VByte PostingList compression
This patch implements Variable-length Byte compression of `PostingList`s
to sacrifice some performance for lower memory consumption.

`PostingList` compression and decompression were tested extensively, using
a fuzzer for multiple hours and running a significant number of realistic
`FuzzyFindRequest`s. AddressSanitizer and UndefinedBehaviorSanitizer
were used to ensure the correct behaviour.

Performance evaluation was conducted with recent LLVM symbol index (292k
symbols) and the collection of user-recorded queries (7751
`FuzzyFindRequest` JSON dumps):

| Metrics | Before | After | Change (%) |
| ----- | ----- | ----- | ----- |
| Memory consumption (posting lists only), MB | 54.4 | 23.5 | -60% |
| Time to process queries, sec | 7.70 | 9.4 | +25% |
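
For readers unfamiliar with the technique, here is a generic sketch of Variable-length Byte encoding over delta-encoded, sorted document IDs. It illustrates the idea only; the function names are hypothetical, and the chunk layout used by the actual `PostingList` is not shown here and may differ.

```cpp
#include <cstdint>
#include <vector>

// Encodes sorted IDs as gaps between neighbours. Each byte carries 7 payload
// bits (least significant group first); the high bit means "more bytes follow".
std::vector<uint8_t> encodeVByte(const std::vector<uint32_t> &SortedIDs) {
  std::vector<uint8_t> Out;
  uint32_t Prev = 0;
  for (uint32_t ID : SortedIDs) {
    uint32_t Delta = ID - Prev;
    Prev = ID;
    while (Delta >= 0x80) {
      Out.push_back(static_cast<uint8_t>((Delta & 0x7F) | 0x80));
      Delta >>= 7;
    }
    Out.push_back(static_cast<uint8_t>(Delta));
  }
  return Out;
}

// Reverses encodeVByte: reassembles each gap and accumulates it into an ID.
std::vector<uint32_t> decodeVByte(const std::vector<uint8_t> &Bytes) {
  std::vector<uint32_t> IDs;
  uint32_t Prev = 0, Delta = 0;
  unsigned Shift = 0;
  for (uint8_t B : Bytes) {
    Delta |= static_cast<uint32_t>(B & 0x7F) << Shift;
    if (B & 0x80) { // Continuation bit set: keep reading this gap.
      Shift += 7;
      continue;
    }
    Prev += Delta;
    IDs.push_back(Prev);
    Delta = 0;
    Shift = 0;
  }
  return IDs;
}
```

Small gaps fit in a single byte, which is where the memory saving comes from; the price is that elements must be decoded sequentially, which is consistent with the slowdown reported in the table.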

Reviewers: sammccall, ioeric

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D52300

llvm-svn: 342965
2018-09-25 11:54:51 +00:00
Sam McCall 3ca9759a21 [clangd] Fix uninit bool in r342888
llvm-svn: 342903
2018-09-24 16:52:48 +00:00
Sam McCall 8fb7bb2482 [clangd] Do bounds checks while reading data, otherwise var-length records are too painful. NFC
llvm-svn: 342888
2018-09-24 14:51:15 +00:00
Kirill Bobyrev 94af0612e0 [clangd] Force Dex to respect symbol collector flags
`Dex` should utilize `FuzzyFindRequest.RestrictForCodeCompletion` flags
and omit symbols not meant for code completion when asked for it.

The measurements below were conducted with
`FuzzyFindRequest.RestrictForCodeCompletion` set to `true` (so that they are
more realistic). Sadly, the average latency goes up; I suspect that is
mostly because of the empty queries, where the number of posting lists is
critical.

| Metrics | Before | After | Relative difference |
| ----- | ----- | ----- | ----- |
| Cumulative query latency (7000 `FuzzyFindRequest`s over LLVM static index) | 6182735043 ns | 7202442053 ns | +16% |
| Whole Index size | 81.24 MB | 81.79 MB | +0.6% |

Out of 292252 symbols collected from LLVM codebase 136926 appear to be
restricted for code completion.

Reviewers: ioeric

Differential Revision: https://reviews.llvm.org/D52357

llvm-svn: 342866
2018-09-24 08:45:18 +00:00
Eric Liu c275fb2a5d [clangd] Remember to serialize symbol origin in YAML.
llvm-svn: 342730
2018-09-21 13:04:57 +00:00
Eric Liu 467c5f9ce0 [clangd] Store preamble macros in dynamic index.
Summary:
Pros:
o Loading macros from preamble for every completion is slow (see profile).
o Calculating macro USR is also slow (see profile).
o Sema can provide a lot of macro completion results (e.g. when filter is empty,
60k for some large TUs!).

Cons:
o Slight memory increase in dynamic index (~1%).
o Some extra work during preamble build (should be fine as preamble build and
indexAST is way slower).

Before:
{F7195645}

After:
{F7195646}

Reviewers: ilya-biryukov, sammccall

Reviewed By: sammccall

Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52078

llvm-svn: 342529
2018-09-19 09:35:04 +00:00
Sam McCall 46b5555844 [clangd] Fix error handling for SymbolID parsing (notably YAML and dexp)
llvm-svn: 342505
2018-09-18 19:00:59 +00:00
Eric Liu 764f461f9c [clangd] Get rid of Decls parameter in indexMainDecls. NFC
It's already available in ParsedAST.

llvm-svn: 342473
2018-09-18 13:35:16 +00:00
Eric Liu 821a116818 [clangd] Merge ClangdServer::DynamicIndex into FileIndex. NFC.
Summary:
FileIndex now provides explicit interfaces for preamble and main file updates.
This avoids growing parameter list when preamble and main symbols diverge
further (e.g. D52078). This also gets rid of the hack in `indexAST` that
inferred main file index based on `TopLevelDecls`.

Also separate `indexMainDecls` from `indexAST`.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52222

llvm-svn: 342460
2018-09-18 10:30:44 +00:00
Sam McCall 3bf9b6d920 [clangd] dexp tool uses llvm::cl to parse its flags.
Summary:
We can use cl::ResetCommandLineParser() to support different types of
command-lines, as long as we're careful about option lifetimes.
(I tried using subcommands, but the error messages were bad)
I found a mostly-reasonable pattern to isolate the fiddly parts.

Added -scope and -limit flags to the `find` command to demonstrate.
(Note that scope support seems to be broken in dex?)

Fixed symbol lookup to parse symbol IDs.

Caveats:
 - with command help (e.g. `find -help`), you also get some spam
   about required arguments. This is a bug in llvm::cl, which prints
   these to errs() rather than the designated stream.

Reviewers: kbobyrev

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51989

llvm-svn: 342456
2018-09-18 09:49:57 +00:00
Eric Liu f736766659 [clangd] Adapt API change after 342451.
llvm-svn: 342452
2018-09-18 08:52:14 +00:00
Eric Liu a57afd091f [clangd] Get rid of AST matchers in SymbolCollector. NFC
Reviewers: ilya-biryukov, kadircet

Subscribers: MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D52089

llvm-svn: 342362
2018-09-17 07:43:49 +00:00
Kirill Bobyrev 249c5864cf [clangd] Introduce PostingList interface
This patch abstracts `PostingList` interface and reuses existing
implementation. It will be used later to test different `PostingList`
representations.

No functionality change is introduced, this patch is mostly refactoring
so that the following patches could focus on functionality while not
being too hard to review.

Reviewed By: sammccall, ioeric

Differential Revision: https://reviews.llvm.org/D51982

llvm-svn: 342155
2018-09-13 17:11:03 +00:00
Kirill Bobyrev bd72b08eb3 [clangd] Fix Dexp build
%s/MaxCandidateCount/Limit/g after rL342138.

llvm-svn: 342143
2018-09-13 15:35:55 +00:00
Kirill Bobyrev e6dd0806c7 [clangd] Cleanup FuzzyFindRequest filtering limit semantics
As discussed during D51860 review, it is better to use `llvm::Optional`
here as it has clear semantics which reflect intended behavior.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D52028

llvm-svn: 342138
2018-09-13 14:27:03 +00:00
Kirill Bobyrev d9f33b129c [clangd] Don't create child AND and OR iterators with one posting list
`AND( AND( Child ) ... )` -> `AND( Child ... )`
`AND( OR( Child ) ... )` -> `AND( Child ... )`

This simple optimization results in 5-6% performance improvement in the
benchmark with 2000 serialized `FuzzyFindRequest`s.

Reviewed By: ilya-biryukov

Differential Revision: https://reviews.llvm.org/D52016

llvm-svn: 342124
2018-09-13 10:02:48 +00:00
Heejin Ahn 386d272387 [clangd] Add missing clangBasic target_link_libraries
Without this, builds with `-DSHARED_LIB=ON` fail.

llvm-svn: 342037
2018-09-12 09:40:13 +00:00
Kirill Bobyrev e1e19c7b75 [clangd] Implement a Proof-of-Concept tool for symbol index exploration
Reviewed By: sammccall, ilya-biryukov

Differential Revision: https://reviews.llvm.org/D51628

llvm-svn: 342025
2018-09-12 07:32:54 +00:00
Kirill Bobyrev 0dee397e06 [clangd] NFC: Use uint32_t for FuzzyFindRequest limits
Reviewed By: ioeric

Differential Revision: https://reviews.llvm.org/D51860

llvm-svn: 341921
2018-09-11 10:31:38 +00:00
Kirill Bobyrev 5faf8a3d84 [clangd] Unbreak buildbots after r341802
Solution: use std::move when returning result from toJSON(...).
llvm-svn: 341832
2018-09-10 14:31:38 +00:00
Kirill Bobyrev 09f00dcf69 [clangd] Implement FuzzyFindRequest JSON (de)serialization
JSON (de)serialization of `FuzzyFindRequest` might be useful for both
D51090 and D51628. Also, this allows precise logging of the fuzzy find
requests.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D51852

llvm-svn: 341802
2018-09-10 11:51:05 +00:00
Kirill Bobyrev 38a889c185 [clangd] Add symbol slab size to index memory consumption estimates
Currently, `SymbolIndex::estimateMemoryUsage()` returns the "overhead"
estimate, i.e. the estimate of the Index data structure excluding
backing data (such as Symbol Slab and Reference Slab). This patch
propagates information about paired data size where necessary.

Reviewed By: ioeric, sammccall

Differential Revision: https://reviews.llvm.org/D51539

llvm-svn: 341800
2018-09-10 11:46:07 +00:00
Kirill Bobyrev 5abe478a3d [clangd] NFC: Rename DexIndex to Dex
Also, cleanup some redundant includes.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D51774

llvm-svn: 341784
2018-09-10 08:23:53 +00:00
Kirill Bobyrev 59491a1fa9 [clangd] Make advanceTo() faster on Posting Lists
If the current element is already beyond advanceTo()'s DocID, just
return instead of doing binary search. This simple optimization improves
performance by up to 6-7%.
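
The idea can be sketched as follows, assuming an uncompressed sorted posting list. `PostingCursor` is a hypothetical stand-in, not the actual iterator class; the sketch also treats an element equal to the target as a fast-path hit, since the binary search would not move in that case.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using DocID = uint32_t;

// Hypothetical cursor over a sorted posting list that outlives the cursor.
class PostingCursor {
public:
  explicit PostingCursor(const std::vector<DocID> &Docs)
      : Docs(Docs), Pos(Docs.begin()) {}

  bool reachedEnd() const { return Pos == Docs.end(); }
  DocID peek() const { return *Pos; }

  // Moves to the first element that is >= ID.
  void advanceTo(DocID ID) {
    // Fast path: nothing to do if we are already at or past the target.
    if (reachedEnd() || *Pos >= ID)
      return;
    // Slow path: binary search in the remaining tail of the list.
    Pos = std::lower_bound(Pos, Docs.end(), ID);
  }

private:
  const std::vector<DocID> &Docs;
  std::vector<DocID>::const_iterator Pos;
};
```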

Reviewed By: ilya-biryukov

Differential Revision: https://reviews.llvm.org/D51802

llvm-svn: 341781
2018-09-10 07:57:28 +00:00
Eric Liu f76886859f [clangd] Canonicalize include paths in clangd.
Get rid of "../"  and "../../".

llvm-svn: 341645
2018-09-07 09:40:36 +00:00
Eric Liu 6df66001ee [clangd] Add "Deprecated" field to Symbol and CodeCompletion.
Summary: Also set "deprecated" field in LSP CompletionItem.

Reviewers: sammccall, kadircet

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D51724

llvm-svn: 341576
2018-09-06 18:52:26 +00:00
Kirill Bobyrev 049b2d4345 [clangd] Fix Dex initialization
This patch sets the URI schemes of Dex to SymbolCollector's default schemes
when callers pass an empty list of schemes. This was the case
for initialization in Clangd main and caused incorrect
behavior.

Also, it fixes a bug with a missing `continue;` after spotting an invalid URI
scheme conversion.

llvm-svn: 341552
2018-09-06 15:10:10 +00:00
Kirill Bobyrev afbf31854d [clangd] NFC: Use TopN instead of std::priority_queue
Quality.cpp defines a structure for convenient storage of Top N items,
it should be used instead of the `std::priority_queue` with slightly
obscure semantics.

This patch does not affect functionality.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D51676

llvm-svn: 341544
2018-09-06 13:15:03 +00:00
Kirill Bobyrev e4ee0213d4 [clangd] NFC: mark single-parameter constructors explicit
Code health: prevent implicit conversions to user-defined types.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D51690

llvm-svn: 341543
2018-09-06 13:06:04 +00:00
Kirill Bobyrev 19a9461e5f [clangd] Implement proximity path boosting for Dex
This patch introduces `PathURI` Search Token kind and utilizes it to
uprank symbols which are defined in files with small distance to the
directory where the fuzzy find request is coming from (e.g. files user
is editing).
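
One way to think about "small distance" is in terms of shared parent directories. The sketch below only enumerates the parent directories of a path, nearest first; the function name, the use of plain paths instead of URIs, and the level cap are all assumptions for illustration, not the patch's actual token generation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns up to MaxLevels parent directories of Path, nearest first.
// The filesystem root itself is not included.
std::vector<std::string> parentDirs(std::string Path, std::size_t MaxLevels) {
  std::vector<std::string> Result;
  while (Result.size() < MaxLevels) {
    std::size_t Slash = Path.rfind('/');
    if (Slash == std::string::npos || Slash == 0)
      break; // No more parents below the root.
    Path.resize(Slash);
    Result.push_back(Path);
  }
  return Result;
}
```

A symbol whose defining file shares a near parent with the requesting file could then receive a larger boost than one that only shares a distant parent.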

Reviewed By: ioeric

Reviewers: ioeric, sammccall

Differential Revision: https://reviews.llvm.org/D51481

llvm-svn: 341542
2018-09-06 12:54:43 +00:00
Eric Liu d25f1214a8 [clangd] Set SymbolID for sema macros so that they can be merged with index macros.
Reviewers: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51688

llvm-svn: 341534
2018-09-06 09:59:37 +00:00
Sam McCall e4fa7b8418 [clangd] make zlib compression optional for binary format
llvm-svn: 341465
2018-09-05 13:17:47 +00:00
Sam McCall d85264bf53 [clangd] Fix buildbot failures on older compilers from r341375
llvm-svn: 341451
2018-09-05 07:52:49 +00:00
Sam McCall 76c4c3af52 [clangd] Load static index asynchronously, add tracing.
Summary:
Like D51475 but simplified based on recent patches.
While here, clarify that loadIndex() takes a filename, not file content.

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51638

llvm-svn: 341376
2018-09-04 16:19:40 +00:00
Sam McCall 50f3631057 [clangd] Define a compact binary serialization format for symbol slab/index.
Summary:
This is intended to replace the current YAML format for general use.
It's ~10x more compact than YAML, and ~40% more compact than gzipped YAML:
  llvmidx.riff = 20M, llvmidx.yaml = 272M, llvmidx.yaml.gz = 32M
It's also simpler/faster to read and write.

The format is a RIFF container (chunks of (type, size, data)) with:
 - a compressed string table
 - simple binary encoding of symbols (with varints for compactness)
It can be extended to include occurrences, Dex posting lists, etc.
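
To make the container shape concrete, here is a minimal sketch of walking a generic RIFF file: a `RIFF` header, a little-endian body size, a four-byte form type, then chunks of (four-byte ID, little-endian size, data) padded to an even length. The `Chunk` struct and `parseRiff` function are hypothetical names; this is not the reader added by this patch, and the chunk IDs it uses are not shown here.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct Chunk {
  std::string ID;   // Four-character chunk type.
  std::string Data; // Raw payload, without padding.
};

// Reads a 32-bit little-endian integer at offset Off.
static uint32_t readLE32(const std::string &S, std::size_t Off) {
  uint32_t V = 0;
  for (int I = 3; I >= 0; --I)
    V = (V << 8) | static_cast<uint8_t>(S[Off + I]);
  return V;
}

// Parses File into its form type and chunks. Returns false on malformed input.
bool parseRiff(const std::string &File, std::string &Type,
               std::vector<Chunk> &Chunks) {
  if (File.size() < 12 || File.compare(0, 4, "RIFF") != 0)
    return false;
  uint32_t BodySize = readLE32(File, 4);
  if (BodySize < 4 || BodySize > File.size() - 8)
    return false;
  Type = File.substr(8, 4);
  std::size_t Pos = 12, End = 8 + static_cast<std::size_t>(BodySize);
  while (Pos < End) {
    if (End - Pos < 8)
      return false; // Truncated chunk header.
    std::string ID = File.substr(Pos, 4);
    uint32_t Size = readLE32(File, Pos + 4);
    Pos += 8;
    if (Size > End - Pos)
      return false; // Chunk claims more data than is present.
    Chunks.push_back({ID, File.substr(Pos, Size)});
    Pos += Size + (Size & 1); // Chunks are padded to an even length.
  }
  return true;
}
```

Because every chunk is self-describing, a reader can skip chunk types it does not know, which is what makes optional sections straightforward.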

There's no rich backwards-compatibility scheme, but a version number is included
so we can detect incompatible files and do ad-hoc back-compat.

Alternatives considered:
 - compressed YAML or JSON: bulky and slow to load
 - llvm bitstream: confusing model and libraries are hard to use. My attempt
   produced slightly larger files, and the code was longer and slower.
 - protobuf or similar: would be really nice (esp for back-compat) but the
   dependency is a big hassle
 - ad-hoc binary format without a container: it seems clear we're going
   to add posting lists and occurrences here, and that they will benefit
   from sharing a string table. The container makes it easy to debug
   these pieces in isolation, and make them optional.

Reviewers: ioeric

Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51585

llvm-svn: 341375
2018-09-04 16:16:50 +00:00
Kirill Bobyrev cc8b507a60 [clangd] NFC: Change quality type to float
Reviewed by: sammccall

Differential Revision: https://reviews.llvm.org/D51636

llvm-svn: 341374
2018-09-04 15:45:56 +00:00
Kirill Bobyrev d5bc65444c [clangd] Move buildStaticIndex() to SymbolYAML
`buildStaticIndex()` is used by two other tools that I'm building, now
it's useful outside of `tool/ClangdMain.cpp`.

Also, slightly refactor the code while moving it to the different source
file.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D51626

llvm-svn: 341369
2018-09-04 15:10:40 +00:00
Sam McCall b0138317d6 [clangd] SymbolOccurrences -> Refs and cleanup
Summary:
A few things that I noticed while merging the SwapIndex patch:
 - SymbolOccurrences and particularly SymbolOccurrenceSlab are unwieldy names,
   and these names appear *a lot*. Ref, RefSlab, etc seem clear enough
   and read/format much better.
 - The asymmetry between SymbolSlab and RefSlab (build() vs freeze()) is
   confusing and irritating, and doesn't even save much code.
   Avoiding RefSlab::Builder was my idea, but it was a bad one; add it.
 - DenseMap<SymbolID, ArrayRef<Ref>> seems like a reasonable compromise for
   constructing MemIndex - and means many fewer wasted allocations than the
   current DenseMap<SymbolID, vector<Ref*>> for FileIndex, and none for
   slabs.
 - RefSlab::find() is not actually used for anything, so we can throw
   away the DenseMap and keep the representation much more compact.
 - A few naming/consistency fixes: e.g. Slabs,Refs -> Symbols,Refs.

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51605

llvm-svn: 341368
2018-09-04 14:39:56 +00:00
Sam McCall dd4a24c86c [clangd] Fix index-twice regression from r341242
llvm-svn: 341337
2018-09-03 20:26:26 +00:00
Sam McCall 9c7624e14b [clangd] Factor out the data-swapping functionality from MemIndex/DexIndex.
Summary:
This is now handled by a wrapper class SwapIndex, so MemIndex/DexIndex can be
immutable and focus on their job.

Old and busted:
 I have a MemIndex, which holds a shared_ptr<vector<Symbol*>>, which keeps the
 symbol slab alive. I update by calling build(shared_ptr<vector<Symbol*>>).

New hotness: I have a SwapIndex, which holds a unique_ptr<SymbolIndex>, which
 holds a MemIndex, which holds a shared_ptr<void>, which keeps backing
 data alive.
 I update by building a new MemIndex and calling SwapIndex::reset().
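
A simplified sketch of the "new hotness" ownership chain, using standard-library types only. `Index`, `ImmutableIndex` and `Swappable` are hypothetical stand-ins for SymbolIndex, MemIndex and SwapIndex. As an assumption of this sketch, the wrapper stores a shared_ptr internally so that a reader can keep using a snapshot while another thread swaps in a replacement.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal interface; the real index interface is richer.
class Index {
public:
  virtual ~Index() = default;
  virtual std::size_t size() const = 0;
};

// Immutable index that keeps its backing data alive via shared ownership.
class ImmutableIndex : public Index {
public:
  explicit ImmutableIndex(std::shared_ptr<const std::vector<std::string>> Data)
      : Data(std::move(Data)) {}
  std::size_t size() const override { return Data->size(); }

private:
  std::shared_ptr<const std::vector<std::string>> Data;
};

// Wrapper that forwards to the current index and lets it be replaced.
class Swappable : public Index {
public:
  explicit Swappable(std::unique_ptr<Index> Initial)
      : Current(std::move(Initial)) {}

  void reset(std::unique_ptr<Index> New) {
    std::shared_ptr<Index> Old; // Destroyed after the lock is released.
    {
      std::lock_guard<std::mutex> Lock(Mu);
      Old = std::move(Current);
      Current = std::move(New);
    }
  }

  std::size_t size() const override { return snapshot()->size(); }

private:
  std::shared_ptr<Index> snapshot() const {
    std::lock_guard<std::mutex> Lock(Mu);
    return Current;
  }

  mutable std::mutex Mu;
  std::shared_ptr<Index> Current;
};
```

Updating then means building a fresh immutable index and handing it to `reset()`; the old index and its backing data are released once the last reader lets go of its snapshot.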

Reviewers: kbobyrev, ioeric

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51422

llvm-svn: 341318
2018-09-03 14:37:43 +00:00
Eric Liu 83f63e42b2 [clangd] Support multiple #include headers in one symbol.
Summary:
Currently, a symbol can have only one #include header attached, which
might not work well if the symbol can be imported via different #includes depending
on where it's used. This patch stores multiple #include headers (with # references)
for each symbol, so that CodeCompletion can decide which include to insert.

In this patch, code completion simply picks the most popular include as the default inserted header. We also return all possible includes and their edits in the `CodeCompletion` results.
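
The selection rule described above can be sketched like this. The `IncludeCandidate` struct and `pickDefaultInclude` function are hypothetical names for illustration; the fields actually stored on a symbol may be named and typed differently.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record pairing a header with how often it was seen.
struct IncludeCandidate {
  std::string Header;
  unsigned References;
};

// Returns the most referenced header, or an empty string if there is none.
// On a tie, the earliest candidate wins.
std::string pickDefaultInclude(const std::vector<IncludeCandidate> &Includes) {
  if (Includes.empty())
    return "";
  auto Best = std::max_element(
      Includes.begin(), Includes.end(),
      [](const IncludeCandidate &L, const IncludeCandidate &R) {
        return L.References < R.References;
      });
  return Best->Header;
}
```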

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: mgrang, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51291

llvm-svn: 341304
2018-09-03 10:18:21 +00:00