Commit Graph

437 Commits

Author SHA1 Message Date
Ilya Biryukov 4e0c400a47 [clangd] Fix crash due to ObjCPropertyDecl
With ObjCPropertyDecl, ASTNode.OrigD can be an ObjCPropertyImplDecl
which is not a NamedDecl, leading to a crash since the code
incorrectly assumes ASTNode.OrigD will always be a NamedDecl.

Change by dgoldman (David Goldman)!

Differential Revision: https://reviews.llvm.org/D56916

llvm-svn: 351941
2019-01-23 10:35:12 +00:00
Haojian Wu 72ef4510b6 [clangd] Fix the `-Wtype-limits` warning, NFC
The assertion is always true, and triggers a compiler warning, so remove it.

llvm-svn: 351809
2019-01-22 12:21:25 +00:00
Kadir Cetinkaya be6b35dac4 [clangd] Filter out plugin related flags and move all commandline manipulations into OverlayCDB.
Summary:
Some projects make use of clang plugins when building, but clangd is
not aware of those plugins and therefore can't work with the same compile
command arguments.

There were multiple places where clangd performed command-line manipulations;
this patch also moves them all into OverlayCDB.

Reviewers: ilya-biryukov

Subscribers: klimek, sammccall, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D56841

llvm-svn: 351788
2019-01-22 09:10:20 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Ilya Biryukov 1c48f0383c [clangd] Make background index less chatty
Summary:
It is producing too much output in non-verbose mode,
i.e. a message per indexed file.

Reviewers: sammccall, kadircet

Reviewed By: sammccall

Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D56915

llvm-svn: 351563
2019-01-18 17:04:26 +00:00
Kadir Cetinkaya 226af75a02 [clangd] Fix updated file detection logic in indexing
Summary:
Files without any symbols were never marked as updated during indexing, which resulted in failure while writing shards for these files.

This patch fixes the logic to mark files that are seen for the first time but don't contain any symbols as updated.

Reviewers: ilya-biryukov

Reviewed By: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D56592

llvm-svn: 351170
2019-01-15 09:03:33 +00:00
Haojian Wu c34f022bfe [clangd] Add Limit parameter for xref.
Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D56597

llvm-svn: 351081
2019-01-14 18:11:09 +00:00
Kadir Cetinkaya 560b853ccf [clangd] Fix a reference invalidation
Summary: Fix for the breakage in http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/52811/consoleFull#-42777206a1ca8a51-895e-46c6-af87-ce24fa4cd561

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D56656

llvm-svn: 351052
2019-01-14 11:24:07 +00:00
Sam McCall 0e93b076c4 [clangd] Index main-file symbols (bug 39761)
Patch by Nathan Ridge!

Differential Revision: https://reviews.llvm.org/D55185

llvm-svn: 351041
2019-01-14 10:01:17 +00:00
Kadir Cetinkaya 99b060e447 [clangd] Introduce loading of shards within auto-index
Summary:
Whenever a change happens on a CDB, load shards associated with that
CDB before issuing re-index actions.

Reviewers: ilya-biryukov

Reviewed By: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D55224

llvm-svn: 350847
2019-01-10 17:03:04 +00:00
Haojian Wu 8f85b9f867 [clangd] Don't store completion info if the symbol is not used for code completion.
Summary:
This would save us some memory and disk space:
  - Dex usage (261 MB vs 266 MB)
  - Disk (75 MB vs 76 MB)

It would save more once we index main-file symbols (D55185).

Reviewers: ilya-biryukov

Reviewed By: ilya-biryukov

Subscribers: nridge, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D56314

llvm-svn: 350803
2019-01-10 09:22:40 +00:00
Haojian Wu 073d184ee3 [clangd] Fix a crash when reading an empty index file.
Summary:
Unfortunately, yaml::Input::setCurrentDocument() and yaml::Input::nextDocument() are
internal APIs, and the way we use them may cause a null pointer access when
processing an empty YAML file.

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D56442

llvm-svn: 350633
2019-01-08 15:24:47 +00:00
Ilya Biryukov f2001aa743 [clangd] Remove 'using namespace llvm' from .cpp files. NFC
The new guideline is to qualify with 'llvm::' explicitly both in
'.h' and '.cpp' files. This simplifies moving the code between
header and source files and is easier to keep consistent.

llvm-svn: 350531
2019-01-07 15:45:19 +00:00
Haojian Wu b2d7e269d5 [clangd] Don't miss the expected type in merge.
Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D55918

llvm-svn: 349750
2018-12-20 13:05:46 +00:00
Kadir Cetinkaya dd67793c0c [clangd] Unify path canonicalizations in the codebase
Summary:
There were a few different places where we canonicalized paths, each
one had its own flavor. This patch tries to unify them all under one place.

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D55818

llvm-svn: 349618
2018-12-19 10:46:21 +00:00
Eric Liu 667e8ef7e1 [clangd] BackgroundIndex rebuilds symbol index periodically.
Summary:
Currently, background index rebuilds symbol index on every indexed file,
which can be inefficient. This patch makes it only rebuild symbol index periodically.
As the rebuild no longer happens too often, we could also build more efficient
dex index.

Reviewers: ilya-biryukov, kadircet

Reviewed By: kadircet

Subscribers: dblaikie, MaskRay, jkorous, arphaman, jfb, cfe-commits

Differential Revision: https://reviews.llvm.org/D55770

llvm-svn: 349496
2018-12-18 15:39:33 +00:00
Kadir Cetinkaya e913b956aa [clangd] Change diskbackedstorage to be atomic
Summary:
There was a chance that multiple clangd instances could try to write the
same shard, in which case we would most likely get a malformed file. This patch
changes the writing mechanism to first write to a temporary file and then rename
it to the real destination, which is guaranteed to be atomic by POSIX.
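
A minimal sketch of the write-then-rename pattern described above, using only the C++ standard library. The helper name `writeAtomically` and the `.tmp` suffix are hypothetical and are not taken from clangd; a real implementation would choose a unique temporary name per writer so that two processes never share one temporary file.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: writes Content to a temporary sibling of Path and then
// renames it over Path. Readers see either the old file or the complete new
// one, never a half-written shard (on POSIX, where rename() is atomic).
inline bool writeAtomically(const std::string &Path,
                            const std::string &Content) {
  assert(!Path.empty() && "destination path must be set");
  // Illustrative naming only; a unique name per writer is needed in practice.
  const std::string TmpPath = Path + ".tmp";
  {
    std::ofstream Out(TmpPath, std::ios::binary | std::ios::trunc);
    if (!Out)
      return false;
    Out << Content;
    Out.flush();
    if (!Out) {
      std::remove(TmpPath.c_str());
      return false;
    }
  } // The stream is closed here, before the rename.
  if (std::rename(TmpPath.c_str(), Path.c_str()) != 0) {
    std::remove(TmpPath.c_str());
    return false;
  }
  return true;
}
```

Calling `writeAtomically("foo.idx", Data)` either leaves a complete `foo.idx` or reports failure. Note that `std::rename` onto an existing file is only specified to replace it on POSIX systems; other platforms may behave differently.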

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, jfb, cfe-commits

Differential Revision: https://reviews.llvm.org/D55417

llvm-svn: 349348
2018-12-17 12:38:22 +00:00
Kadir Cetinkaya 375c54fd1e [clangd] Only reduce priority of a thread for indexing.
Summary:
We'll soon have tasks pending for reading shards from disk, we want
them to have normal priority. Because:
- They are not CPU intensive, mostly IO bound.
- They give good coverage of the project at startup, so it is worth
  spending some cycles.
- We have only one task per whole CDB rather than one task per file.

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, jfb, cfe-commits

Differential Revision: https://reviews.llvm.org/D55315

llvm-svn: 349345
2018-12-17 12:30:27 +00:00
Kadir Cetinkaya 1b65b376ae [dexp] Change FuzzyFind to also print scope of symbols
Summary:
When there are multiple symbols in the result of a fuzzy find with the
same name, one has to perform an additional query to figure out which of those
symbols are coming from the "interesting" scope. This patch prints the scope in
fuzzy find results to get rid of the second query.

Reviewers: hokein

Subscribers: ilya-biryukov, ioeric, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D55705

llvm-svn: 349152
2018-12-14 14:17:18 +00:00
Haojian Wu d5a78e6e59 [clangd] Fix an assertion failure in background index.
Summary:
When indexing a file which contains an error that prevents compilation, we
trigger an assertion failure: the IndexFileIn data is not set, but we
access it in the background index.

Reviewers: kadircet

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D55650

llvm-svn: 349144
2018-12-14 12:39:08 +00:00
Haojian Wu 9d0d9f884c [clangd] Move the utility function to anonymous namespace, NFC.
llvm-svn: 349031
2018-12-13 13:07:29 +00:00
Kadir Cetinkaya 219c0fae5c [clangd] Partition include graph on auto-index.
Summary:
Partitions include graphs in auto-index so that each shard contains
only the part of the include graph related to itself.

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D55062

llvm-svn: 348252
2018-12-04 11:31:57 +00:00
Haojian Wu 7800dbe157 [clangd] Fix a stale comment, NFC.
llvm-svn: 348133
2018-12-03 13:16:04 +00:00
Kadir Cetinkaya 5399552da1 [clangd] Populate include graph during static indexing action.
Summary:
This is the second part for introducing include hierarchy into index
files produced by clangd. You can see the base patch that introduces structures
and discusses the future of the patches in D54817

Reviewers: ilya-biryukov

Subscribers: mgorny, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D54999

llvm-svn: 348005
2018-11-30 16:59:00 +00:00
Jan Korous 6089b6192e [clangd][NFC] Move SymbolID to a separate file
Prerequisite for textDocument/SymbolInfo

Differential Revision: https://reviews.llvm.org/D54799

llvm-svn: 347674
2018-11-27 16:40:34 +00:00
Kadir Cetinkaya d08eab4281 [clangd] Put direct headers into srcs section.
Summary:
Currently, there's no way of knowing about header files
using compilation database, since it doesn't contain header files as entries.

Using this information, restoring from cache using compile commands becomes
possible instead of doing directory traversal. Also, we can issue indexing
actions for out-of-date headers even if source files depending on them haven't
changed.

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D54817

llvm-svn: 347669
2018-11-27 16:08:53 +00:00
Sam McCall 422c828dfc [clangd] Enable auto-index behind a flag.
Summary:
Ownership and configuration:
The auto-index (background index) is maintained by ClangdServer, like Dynamic.
(This means ClangdServer will be able to enqueue preamble indexing in future).
For now it's enabled by a simple boolean flag in ClangdServer::Options, but
we probably want to eventually allow injecting the storage strategy.

New 'sync' command:
In order to meaningfully test the integration (not just unit-test components)
we need a way for tests to ensure the asynchronous index reads/writes occur
before a certain point.
Because these tests and assertions are few, I think exposing an explicit "sync"
command for use in tests is simpler than allowing threading to be completely
disabled in the background index (as we do for TUScheduler).

Bugs:
I fixed a couple of trivial bugs I found while testing, but there's one I can't.
JSONCompilationDatabase::getAllFiles() may return relative paths, and currently
we trigger an assertion that assumes they are absolute.
There's no efficient way to resolve them (you have to retrieve the corresponding
command and then resolve against its directory property). In general I think
this behavior is broken and we should fix it in JSONCompilationDatabase and
require CompilationDatabase::getAllFiles() to be absolute.

Reviewers: kadircet

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D54894

llvm-svn: 347567
2018-11-26 16:00:11 +00:00
Ilya Biryukov 4d3d82eef9 [clangd] Fix use-after-free with expected types in indexing
llvm-svn: 347563
2018-11-26 15:52:16 +00:00
Ilya Biryukov 647da3e8a5 [clangd] Add type boosting in code completion
Reviewers: sammccall, ioeric

Reviewed By: sammccall

Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52276

llvm-svn: 347562
2018-11-26 15:38:01 +00:00
Ilya Biryukov a21392bfc7 [clangd] Collect and store expected types in the index
Summary:
And add a hidden option to control whether the types are collected.
For experiments, will be removed when expected types implementation
is stabilized.

The index size is almost unchanged, e.g. the YAML index for all clangd
sources increased from 53MB to 54MB.

Reviewers: ioeric, sammccall

Reviewed By: sammccall

Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52274

llvm-svn: 347560
2018-11-26 15:29:14 +00:00
Sam McCall 7d0e4848ad [clangd] Fix missing include from r347538 - fix windows buildbots
llvm-svn: 347554
2018-11-26 13:35:02 +00:00
Sam McCall 6e2d2a33b6 [clangd] Auto-index watches global CDB for changes.
Summary:
Instead of receiving compilation commands, auto-index is triggered by just
filenames to reindex, and gets commands from the global comp DB internally.
This has advantages:
 - more of the work can be done asynchronously (fetching compilation commands
   upfront can be slow for large CDBs)
 - we get access to the CDB which can be used to retrieve interpolated commands
   for headers (useful in some cases where the original TU goes away)
 - fits nicely with the filename-only change observation from r347297

The interface to GlobalCompilationDatabase gets extended: when retrieving a
compile command, the GCDB can optionally report the project the file belongs to.
This naturally fits together with getCompileCommand: it's hard to implement one
without the other. But because most callers don't care, I've ended up with an
awkward optional-out-param-in-virtual method pattern - maybe there's a better
one.

This is the main missing integration point between ClangdServer and
BackgroundIndex, after this we should be able to add an auto-index flag.

Reviewers: ioeric, kadircet

Subscribers: MaskRay, jkorous, arphaman, cfe-commits, ilya-biryukov

Differential Revision: https://reviews.llvm.org/D54865

llvm-svn: 347538
2018-11-26 09:51:50 +00:00
Eric Liu c0ac4bb17c [clangd] Cleanup: stop passing around list of supported URI schemes.
Summary:
Instead of passing around a list of supported URI schemes in clangd, we
expose an interface to convert a path to URI using any compatible scheme
that has been registered. It favors customized schemes and falls
back to "file" when no other scheme works.

Changes in this patch are:
- URI::create(AbsPath, URISchemes) -> URI::create(AbsPath). The new API finds a
compatible scheme from the registry.
- Remove URISchemes option everywhere (ClangdServer, SymbolCollector, FileIndex etc).
- Unit tests will use "unittest" by default.
- Move "test" scheme from ClangdLSPServer to ClangdMain.cpp, and only
register the test scheme when lit-test or enable-lit-scheme is set.
(The new flag is added to make lit protocol.test work; I wonder if there
is an alternative here.)

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D54800

llvm-svn: 347467
2018-11-22 15:02:05 +00:00
Kadir Cetinkaya dd91a36422 Address comments.
llvm-svn: 347237
2018-11-19 18:06:36 +00:00
Kadir Cetinkaya 244ac0dba0 Use digest size instead of hardcoding it.
llvm-svn: 347236
2018-11-19 18:06:33 +00:00
Kadir Cetinkaya ca9e5dc714 [clangd] Store source file hash in IndexFile{In,Out}
Summary:
Puts the digest of the source file that generated the index into
serialized index and stores them back on load, if exists.

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D54693

llvm-svn: 347235
2018-11-19 18:06:29 +00:00
Haojian Wu 22c9f7b296 [clangd] Truncate SymbolID to 8 bytes.
Summary:
This is our goal. It has a non-zero risk, but so far we haven't seen any
collisions (externally or internally).

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D54622

llvm-svn: 347044
2018-11-16 10:58:40 +00:00
Haojian Wu 1bf52c59b7 [clangd] Fix a compiler warning and test crashes caused in rL347038.
llvm-svn: 347039
2018-11-16 09:41:14 +00:00
Kadir Cetinkaya 06553bfe96 Introduce shard storage to auto-index.
Reviewers: sammccall, ioeric

Reviewed By: sammccall

Subscribers: llvm-commits, mgorny, Eugene.Zelenko, ilya-biryukov, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D54269

llvm-svn: 347038
2018-11-16 09:03:56 +00:00
Haojian Wu fd4d45514f [clangd] global-symbol-builder => clangd-indexer
llvm-svn: 346955
2018-11-15 14:15:19 +00:00
Haojian Wu 5e7486f518 [clangd] Fix no results returned for global symbols in dexp
Summary:
For symbols in global namespace (without any scope), we need to
add global scope "" to the fuzzy request.

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D54519

llvm-svn: 346947
2018-11-15 12:17:41 +00:00
Kadir Cetinkaya 5a9b92ca75 Revert "Introduce shard storage to auto-index."
This reverts commit 6dd1f24aead10a8d375d0311001987198d26e900.

llvm-svn: 346945
2018-11-15 10:34:47 +00:00
Kadir Cetinkaya bd2441c887 Revert "clang-format"
This reverts commit 0a37e9c3d88a2e21863657df2f7735fb7e5f746e.

llvm-svn: 346944
2018-11-15 10:34:43 +00:00
Kadir Cetinkaya ed18e788f0 Revert "Address comments"
This reverts commit 19a39b14eab2b5339325e276262b177357d6b412.

llvm-svn: 346943
2018-11-15 10:34:39 +00:00
Kadir Cetinkaya 8b9fed3e8d Revert "Address comments."
This reverts commit b43c4d1c731e07172a382567f3146b3c461c5b69.

llvm-svn: 346942
2018-11-15 10:34:35 +00:00
Kadir Cetinkaya 2bed2cf791 Address comments.
llvm-svn: 346941
2018-11-15 10:31:23 +00:00
Kadir Cetinkaya 89a7691fd9 Address comments
llvm-svn: 346940
2018-11-15 10:31:19 +00:00
Kadir Cetinkaya cb8407ca89 clang-format
llvm-svn: 346939
2018-11-15 10:31:15 +00:00
Kadir Cetinkaya 3e5a47560c Introduce shard storage to auto-index.
Reviewers: sammccall, ioeric

Subscribers: ilya-biryukov, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D54269

llvm-svn: 346938
2018-11-15 10:31:10 +00:00
Haojian Wu ee54a2b501 [clangd] Replace StringRef in SymbolLocation with a char pointer.
Summary:
This would save us 8 bytes per ref, and buy us ~40MB in total
for llvm index (from ~300MB to ~260 MB).

The char pointer must be null-terminated, and llvm::StringSaver
guarantees it.
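
As a rough illustration of where the per-ref saving comes from, the sketch below compares a location that stores a pointer-plus-length reference with one that stores a bare C string. All type names here are hypothetical stand-ins, not clangd's definitions, and the exact sizes depend on the target ABI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in for a pointer-plus-length string reference such as llvm::StringRef.
struct StringRefLike {
  const char *Data;
  std::size_t Length;
};

// Hypothetical "before" layout: the file URI carries its own length.
struct LocationWithRef {
  StringRefLike FileURI;
  std::uint32_t Start;
  std::uint32_t End;
};

// Hypothetical "after" layout: only a pointer is stored, so the string
// must be null-terminated for its length to be recoverable.
struct LocationWithPtr {
  const char *FileURI;
  std::uint32_t Start;
  std::uint32_t End;
};

// The length is recomputed on demand instead of being stored per location.
inline std::size_t uriLength(const LocationWithPtr &Loc) {
  assert(Loc.FileURI != nullptr);
  return std::strlen(Loc.FileURI);
}
```

On a typical 64-bit target the two layouts differ by one `size_t`, which is the 8 bytes mentioned above. The trade-off is that the length now costs a `strlen` whenever it is needed.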

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53427

llvm-svn: 346852
2018-11-14 11:55:45 +00:00
Haojian Wu 172c045590 [clangd] Don't show all refs results if -name is ambiguous in dexp.
Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D54430

llvm-svn: 346671
2018-11-12 16:41:15 +00:00
Haojian Wu 62fb2a216e [clangd] Allow symbols from AnyScope in dexp.
Summary:
We should allow symbols from any scope in dexp results, otherwise
`find StringRef` doesn't return any results (llvm::StringRef).

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D54427

llvm-svn: 346666
2018-11-12 16:03:59 +00:00
Eric Liu 961024f174 [clangd] Remember to serialize AnyScope in FuzzyFindRequest json.
llvm-svn: 346648
2018-11-12 12:24:08 +00:00
Haojian Wu f761a2c620 [clangd] Drop namespace references in the index.
Summary:
Namespace references are less useful than references to other symbols, and
they contribute a large part of the index. This patch drops them.
The number of refs is reduced from 5.4 million to 4.7 million.

|           | Before | After  |
| file size | 78 MB  | 71 MB  |
| memory    | 330 MB | 300 MB |

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D54202

llvm-svn: 346319
2018-11-07 14:59:24 +00:00
Kadir Cetinkaya f84a7d8d4f [clangd] [NFC] Fix clang-tidy warnings.
Reviewers: ioeric, sammccall, ilya-biryukov, hokein

Subscribers: MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D54157

llvm-svn: 346308
2018-11-07 12:25:27 +00:00
Eric Liu b04869a4aa [clangd] Get rid of QueryScopes.empty() == AnyScope special case.
Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53933

llvm-svn: 346223
2018-11-06 11:08:17 +00:00
Eric Liu ad588af2d6 [clangd] auto-index stores symbols per-file instead of per-TU.
Summary:
This allows us to deduplicate header symbols across TUs. File digests
are collected when collecting symbols/refs. And the index store deduplicates
file symbols based on the file digest.

Reviewers: sammccall, hokein

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53433

llvm-svn: 346221
2018-11-06 10:55:21 +00:00
Kadir Cetinkaya 6675be8747 [clangd] Use thread pool for background indexing.
Reviewers: sammccall, ioeric

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, jfb, cfe-commits

Differential Revision: https://reviews.llvm.org/D53651

llvm-svn: 345590
2018-10-30 12:13:27 +00:00
Kadir Cetinkaya b915790385 [clangd] Do not query index for new name completions.
Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D53192

llvm-svn: 345153
2018-10-24 15:24:29 +00:00
Haojian Wu 40d5684d41 [clangd] Hide position line and column fields.
Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53577

llvm-svn: 345134
2018-10-24 12:56:41 +00:00
Sam McCall 668ac94ba4 [clangd] Truncate SymbolID to 16 bytes.
Summary:
The goal is 8 bytes, which has a nonzero risk of collisions with huge indexes.
This patch should shake out any issues with truncation at all, we can lower
further later.

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53587

llvm-svn: 345113
2018-10-24 06:58:42 +00:00
Eric Liu 0b70a87480 [clangd] Support URISchemes configuration in BackgroundIndex.
Reviewers: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53503

llvm-svn: 344912
2018-10-22 15:37:58 +00:00
Sam McCall 45b2754097 [clangd] Fix unqualified make_unique after r344850. NFC
llvm-svn: 344858
2018-10-20 17:40:12 +00:00
Sam McCall c008af6466 [clangd] Namespace style cleanup in cpp files. NFC.
Standardize on the most common namespace setup in our *.cpp files:
  using namespace llvm;
  namespace clang {
  namespace clangd {
  void foo(StringRef) { ... }
And remove redundant llvm:: qualifiers. (Except for cases like
make_unique where this causes problems with std:: and ADL).

This choice is pretty arbitrary, but some broad consistency is nice.
This is going to conflict with everything. Sorry :-/

Squash the other configurations:

A)
  using namespace llvm;
  using namespace clang;
  using namespace clangd;
  void clangd::foo(StringRef);
This is in some of the older files. (It prevents accidentally defining a
new function instead of one in the header file, for what that's worth).

B)
  namespace clang {
  namespace clangd {
  void foo(llvm::StringRef) { ... }
This is fine, but in practice the using directive often gets added over time.

C)
  namespace clang {
  namespace clangd {
  using namespace llvm; // inside the namespace
This was pretty common, but is a bit misleading: name lookup prefers
clang::clangd::foo > clang::foo > llvm:: foo (no matter where the using
directive is).
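
The lookup order in configuration C can be checked with a small self-contained sketch. The namespaces below are stand-ins that merely reuse the names `llvm`, `clang`, and `clangd`; `foo` and `callFoo` are hypothetical.

```cpp
#include <cassert>

namespace llvm {
inline int foo() { return 1; }
} // namespace llvm

namespace clang {
inline int foo() { return 2; }

namespace clangd {
using namespace llvm; // inside the namespace, as in configuration C

// Unqualified lookup walks outward: clang::clangd, then clang, then the
// global scope. Names nominated by the using-directive behave as if they
// were declared in the nearest namespace enclosing both llvm and
// clang::clangd, which is the global namespace, so clang::foo is found first.
inline int callFoo() { return foo(); }
} // namespace clangd
} // namespace clang
```

Here `clang::clangd::callFoo()` calls `clang::foo`, not `llvm::foo`, even though the using-directive sits in the innermost namespace.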

llvm-svn: 344850
2018-10-20 15:30:37 +00:00
Simon Pilgrim ad28838111 Fix MSVC "not all control paths return a value" warning. NFCI.
llvm-svn: 344844
2018-10-20 13:18:49 +00:00
Haojian Wu 812b6c51c3 [clangd] Remove the overflow log.
Summary:
LLVM codebase has generated files (all are build/Target/XXX/*.inc) that
exceed the MaxLine & MaxColumn. Printing these logs would be noisy.

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53400

llvm-svn: 344777
2018-10-19 08:35:24 +00:00
Krasimir Georgiev 9035420091 [clangd] Fix msan failure after r344735 by initializing bitfields
That revision changed integer members to bitfields; the integers were
default initialized before and the bitfields lost that default
initialization. This started causing msan use-of-uninitialized memory in
clangd tests.
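
A minimal sketch of this kind of fix, assuming a hypothetical struct with two bit-field members (the name and field widths are illustrative, not the ones from the patch). Before C++20, bit-fields cannot carry default member initializers, so the zeroing has to move into a constructor.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical packed position. A plain `uint32_t Line = 0;` member could be
// initialized in place, but a bit-field needs a constructor before C++20.
struct PackedPosition {
  PackedPosition() : Line(0), Column(0) {}
  std::uint32_t Line : 20;
  std::uint32_t Column : 12;
};

// A default-constructed value now has well-defined contents, so reading it
// no longer touches uninitialized memory.
inline void checkDefaults() {
  PackedPosition P;
  assert(P.Line == 0);
  assert(P.Column == 0);
}
```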

llvm-svn: 344773
2018-10-19 06:05:32 +00:00
Haojian Wu 6ece6e7dad [clangd] Clear the semantic of RefSlab::size.
Summary:
RefSlab::size can easily cause confusion: it returns the number of
different symbols, rather than the number of all references.

- add numRefs() method and cache it, since calculating it every time is nontrivial.
- fix the places where it was misused.

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53389

llvm-svn: 344745
2018-10-18 15:33:20 +00:00
Eric Liu 4859738cfe [clangd] Names that are not spelled in source code are reserved.
Summary:
These are often not expected to be used directly e.g.
```
TEST_F(Fixture, X) {
  ^  // "Fixture_X_Test" expanded in the macro should be down ranked.
}
```

Only doing this for sema for now, as such symbols are mostly coming from sema
e.g. gtest macros expanded in the main file. We could also add a similar field
for the index symbol.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53374

llvm-svn: 344736
2018-10-18 12:23:05 +00:00
Haojian Wu b515fabb3b [clangd] Encode Line/Column as a 32-bits integer.
Summary:
This would save us memory. A 32-bit integer is enough for
most human-readable source code (up to 4M lines and 4K columns).

Previously we used 8 bytes for a position; now we use 4 bytes, which saves
us 8 bytes for each Ref and each Symbol instance.

For LLVM-project binary index file, we save ~13% memory.

| Before | After |
| 412MB  | 355MB |
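
One way to pack a line and column into a single 32-bit value is sketched below. The 12-bit column width matches the 4K-column figure above; giving the line the remaining bits, clamping on overflow, and every name in the block are assumptions of this sketch, and the layout in the patch may differ.

```cpp
#include <cassert>
#include <cstdint>

// Assumed split for this sketch: low bits hold the column, high bits the line.
constexpr unsigned ColumnBits = 12;
constexpr unsigned LineBits = 32 - ColumnBits;
constexpr std::uint32_t MaxColumn = (1u << ColumnBits) - 1;
constexpr std::uint32_t MaxLine = (1u << LineBits) - 1;
static_assert(LineBits + ColumnBits == 32, "position must fit in 32 bits");

// Values that do not fit are clamped to the maximum (a choice of this sketch).
inline std::uint32_t packPosition(std::uint32_t Line, std::uint32_t Column) {
  if (Line > MaxLine)
    Line = MaxLine;
  if (Column > MaxColumn)
    Column = MaxColumn;
  return (Line << ColumnBits) | Column;
}

inline std::uint32_t lineOf(std::uint32_t Packed) {
  return Packed >> ColumnBits;
}

inline std::uint32_t columnOf(std::uint32_t Packed) {
  return Packed & MaxColumn;
}
```

Generated files with extremely long lines would hit the clamp, which is the price of halving the storage per position.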

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53363

llvm-svn: 344735
2018-10-18 10:43:50 +00:00
Haojian Wu c014d863be [clangd] Fix buildbot failure.
llvm-svn: 344680
2018-10-17 08:54:48 +00:00
Haojian Wu 0404855529 [clangd] Print numbers of symbols and refs as well when loading the
index.

llvm-svn: 344679
2018-10-17 08:48:04 +00:00
Haojian Wu 7dd4950ea5 [clangd] Collect refs from headers.
Summary:
Add a flag to SymbolCollector to collect refs from headers.

Note that we collect refs from headers in static index, and we don't do it for
dynamic index because of the preamble (we skip function bodies in the preamble,
so collecting there would produce incomplete results).

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53322

llvm-svn: 344678
2018-10-17 08:38:36 +00:00
Sam McCall bca624ab03 [clangd] Fix threading bugs in (not-yet-used) BackgroundIndex, re-enable test.
Summary:
One relatively boring bug: forgot to notify the CV after enqueue.

One much more fun bug: the thread member could access instance variables before
they were initialized. Although the thread was last in the init list, QueueCV
etc were listed after Thread in the class, so their default constructors raced
with the thread itself.
We have to get very unlucky to lose this race, I saw it 0.02% of the time.
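
The underlying rule is that C++ constructs non-static members in declaration order, regardless of the order in the constructor's initializer list. The sketch below shows that order without any real threads; `Tracer`, `ThreadFirst`, `ThreadLast`, and `firstConstructed` are hypothetical names, and the `Tracer` members only stand in for the thread and the condition variable.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records its name when constructed, so the construction order can be observed.
struct Tracer {
  Tracer(std::vector<std::string> &Log, const char *Name) {
    Log.push_back(Name);
  }
};

// Buggy shape: the thread-like member is declared first, so it is constructed
// (and could start running) before QueueCV exists.
struct ThreadFirst {
  explicit ThreadFirst(std::vector<std::string> &Log)
      : Thread(Log, "Thread"), QueueCV(Log, "QueueCV") {}
  Tracer Thread;
  Tracer QueueCV;
};

// Fixed shape: everything the thread uses is declared, and therefore
// constructed, before the thread itself.
struct ThreadLast {
  explicit ThreadLast(std::vector<std::string> &Log)
      : QueueCV(Log, "QueueCV"), Thread(Log, "Thread") {}
  Tracer QueueCV;
  Tracer Thread;
};

// Returns the name of the member that was constructed first.
template <typename T> std::string firstConstructed() {
  std::vector<std::string> Log;
  T Object(Log);
  (void)Object;
  assert(Log.size() == 2);
  return Log.front();
}
```

With a real `std::thread` member, the `ThreadFirst` shape lets the new thread touch members whose constructors have not run yet, which is the race described above. Declaring the thread last removes it.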

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, jfb, cfe-commits

Differential Revision: https://reviews.llvm.org/D53313

llvm-svn: 344595
2018-10-16 09:05:13 +00:00
Sam McCall 96f2489557 [clangd] Optionally use dex for the preamble parts of the dynamic index.
Summary:
Reuse the old -use-dex-index experiment flag for this.

To avoid breaking the tests, make Dex deduplicate symbols, addressing an old FIXME.

Reviewers: hokein

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53288

llvm-svn: 344594
2018-10-16 08:53:52 +00:00
Sam McCall bc8aee15a2 [clangd] Revert include path change in Dexp. NFC
llvm-svn: 344533
2018-10-15 16:47:45 +00:00
Haojian Wu 397704ca40 [clangd] Add createIndex in dexp
Summary:
This would allow easily injecting our internal customization.

Also updates the stale "symbol-collection-file" flag.

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53292

llvm-svn: 344521
2018-10-15 15:12:40 +00:00
Sam McCall 2b24ce61a0 [clangd] Use SyncAPI in more places in tests. NFC
llvm-svn: 344520
2018-10-15 15:04:03 +00:00
Sam McCall 8dc9dbb61a [clangd] Minimal implementation of automatic static index (not enabled).
Summary:
See tinyurl.com/clangd-automatic-index for design and goals.

Lots of limitations to keep this patch smallish, TODOs everywhere:
 - no serialization to disk
 - no changes to dynamic index, which now has a much simpler job
 - no partitioning of symbols by file to avoid duplication of header symbols
 - no reindexing of edited files
 - only a single worker thread
 - compilation database is slurped synchronously (doesn't scale)
 - uses memindex, rebuilds after every file (should be dex, periodically)

It's not hooked up to ClangdServer/ClangdLSPServer yet: the layering
isn't clear (it should really be in ClangdServer, but ClangdLSPServer
has all the CDB interactions).

Reviewers: ioeric

Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, jfb, cfe-commits

Differential Revision: https://reviews.llvm.org/D53032

llvm-svn: 344513
2018-10-15 13:34:10 +00:00
Haojian Wu 82ba7121e8 [clangd] Remove an unused include header, NFC.
llvm-svn: 344510
2018-10-15 12:39:45 +00:00
Haojian Wu ddec850ceb [clangd] dump xrefs information in dexp tool.
Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53019

llvm-svn: 344508
2018-10-15 12:32:49 +00:00
Haojian Wu e83caccb58 [clangd] Fix some references missing in dynamic index.
Summary:
Previously, SymbolCollector postfilters all references at the end to
find all references of interesting symbols.
This was incorrect when indexing the main AST, where we don't see locations
of symbol declarations and definitions in the main AST (as those are in
preamble AST).

The fix is to do an early check while collecting references.

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D53273

llvm-svn: 344507
2018-10-15 11:46:26 +00:00
Jonas Toth f79f8eecce [clangd] NFC fix semicolon warning
llvm-svn: 344384
2018-10-12 17:47:43 +00:00
Haojian Wu 292a36a0d5 [clangd] Fix an accidental change in r342999.
llvm-svn: 344054
2018-10-09 15:16:14 +00:00
Jonas Toth 3acdd020b4 [clangd] fix miscompiling lower_bound call
llvm-svn: 344044
2018-10-09 13:24:50 +00:00
Kirill Bobyrev 4a5ff88fdb [clangd] NFC: Migrate to LLVM STLExtras API where possible
This patch improves readability by migrating `std::function(ForwardIt
start, ForwardIt end, ...)` to LLVM's STLExtras range-based equivalent
`llvm::function(RangeT &&Range, ...)`.

Similar change in Clang: D52576.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D52650

llvm-svn: 343937
2018-10-07 14:49:41 +00:00
Sam McCall 5fb9746c49 [clangd] Remove last usage of ast matchers from SymbolCollector. NFC
llvm-svn: 343849
2018-10-05 14:03:04 +00:00
Sam McCall 50b89f0a9b [clangd] Simplify Dex query tree logic and fix missing-posting-list bug
Summary:
The bug being fixed: when a posting list doesn't exist in the index, it
was previously just dropped from the query rather than being treated as
empty. Now that we have the FALSE iterator, we can use it instead.

The query tree logic previously had a bunch of special cases to detect whether
subtrees are empty. Now we just naively build the whole tree, and rely
on the query optimizations to drop the trivial parts.

Finally, there was a bug in trigram generation: the empty query would
generate a single trigram "$$$" instead of no trigrams.
This had no effect (there was no posting list, so the other bug
cancelled it out). But we now have to fix this bug too.

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52796

llvm-svn: 343802
2018-10-04 17:18:55 +00:00
Sam McCall aa728f1afa [clangd] Dex: FALSE iterator, peephole optimizations, fix AND bug
Summary:
The FALSE iterator will be used in a followup patch to fix a logic bug in Dex
(currently, tokens that don't have posting lists in the index are simply dropped
from the query, changing semantics).

It can usually be optimized away, so added the following optimizations:
 - simplify booleans inside AND/OR
 - replace effectively-empty AND/OR with booleans
 - flatten nested AND/ORs
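
The three peephole rules listed above could be sketched roughly as follows. This is a toy query tree rather than clangd's actual iterator classes; `Kind`, `Node`, `make` and `optimize` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy query tree; every name here is illustrative, not clangd's.
enum class Kind { True, False, Leaf, And, Or };

struct Node {
  Kind K = Kind::Leaf;
  int LeafID = 0;                              // only meaningful for Leaf
  std::vector<std::unique_ptr<Node>> Children; // only used by And/Or
};

std::unique_ptr<Node> make(Kind K, int LeafID = 0) {
  auto N = std::make_unique<Node>();
  N->K = K;
  N->LeafID = LeafID;
  return N;
}

// Applies the three rules bottom-up and returns the simplified tree.
std::unique_ptr<Node> optimize(std::unique_ptr<Node> N) {
  assert(N && "optimize() expects a non-null tree");
  if (N->K != Kind::And && N->K != Kind::Or)
    return N;
  const bool IsAnd = N->K == Kind::And;
  const Kind Identity = IsAnd ? Kind::True : Kind::False;  // can be dropped
  const Kind Absorbing = IsAnd ? Kind::False : Kind::True; // decides result
  std::vector<std::unique_ptr<Node>> Kept;
  for (auto &Child : N->Children) {
    auto C = optimize(std::move(Child));
    if (C->K == Absorbing)
      return C; // AND(x, FALSE) == FALSE, OR(x, TRUE) == TRUE
    if (C->K == Identity)
      continue; // AND(x, TRUE) == x, OR(x, FALSE) == x
    if (C->K == N->K) {
      // Flatten AND(AND(a, b), c) into AND(a, b, c); same for OR.
      for (auto &GrandChild : C->Children)
        Kept.push_back(std::move(GrandChild));
      continue;
    }
    Kept.push_back(std::move(C));
  }
  if (Kept.empty())
    return make(Identity); // effectively-empty AND is TRUE, OR is FALSE
  if (Kept.size() == 1)
    return std::move(Kept.front());
  N->Children = std::move(Kept);
  return N;
}
```

Because children are simplified before their parent, a FALSE leaf deep inside nested ANDs propagates all the way up without any special-casing at the call site.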

While working on this, found a bug in the AND iterator: its constructor sync()
assumes that ReachedEnd is set if applicable, but the constructor never sets it.
This crashes if a non-first iterator is nonempty.

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52789

llvm-svn: 343801
2018-10-04 17:18:49 +00:00
Sam McCall 422f724618 [clangd] expose MergedIndex class
Summary:
This allows inheriting from it, so index() can go away, allowing
TestTU::index() to be fixed.

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52250

llvm-svn: 343780
2018-10-04 14:20:22 +00:00
Sam McCall cc21779c3c [clangd] clangd-indexer gathers refs and stores them in index files.
Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52531

llvm-svn: 343778
2018-10-04 14:09:55 +00:00
Sam McCall 2ec5a10db3 [clangd] Remove one-segment-skipping from Dex trigrams.
Summary:
Currently queries like "ab" can match identifiers like a_yellow_bee.
The value of allowing this for exactly one segment but no more seems dubious.
It costs ~3% of overall ram (~9% of posting list ram) and some quality.

Reviewers: ilya-biryukov, ioeric

Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52885

llvm-svn: 343777
2018-10-04 14:08:11 +00:00
Sam McCall b5bbfef6cd [clangd] Dex: fix/simplify short-trigram generation
Summary:
1) Instead of x$$ for a short-query trigram, just use x
2) Make rules more coherent: prefixes of length 1-2, and first char + next head
3) Fix Dex::fuzzyFind to mark results as incomplete, because
   short-trigram rules only yield a subset of results.

Reviewers: ioeric

Subscribers: ilya-biryukov, jkorous, mgrang, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52808

llvm-svn: 343775
2018-10-04 14:01:55 +00:00
Sam McCall 87f69eaf4e [clangd] Dex: FALSE iterator, peephole optimizations, fix AND bug
Summary:
The FALSE iterator will be used in a followup patch to fix a logic bug in Dex
(currently, tokens that don't have posting lists in the index are simply dropped
from the query, changing semantics).

It can usually be optimized away, so added the following optimizations:
 - simplify booleans inside AND/OR
 - replace effectively-empty AND/OR with booleans
 - flatten nested AND/ORs

While working on this, found a bug in the AND iterator: its constructor sync()
assumes that ReachedEnd is set if applicable, but the constructor never sets it.
This crashes if a non-first iterator is nonempty.

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52789

llvm-svn: 343774
2018-10-04 13:12:23 +00:00
Sam McCall d9eae39800 [clangd] Support refs() in dex. Largely cloned from MemIndex.
Reviewers: hokein

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52726

llvm-svn: 343760
2018-10-04 09:16:12 +00:00
Sam McCall 41e6d76c22 [clangd] clangd-indexer: Drop support for MR-via-YAML
Summary:
It's slow, and the open-source reduce implementation doesn't scale properly.
While here, tidy up some dead headers and comments.

Reviewers: kadircet

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D52517

llvm-svn: 343759
2018-10-04 08:30:03 +00:00
Sam McCall a659d779f8 Reland r343589 "[clangd] Dex: add Corpus factory for iterators, rename, fold constant. NFC"
This reverts commit r343610.

llvm-svn: 343622
2018-10-02 19:59:23 +00:00
Reid Kleckner 2b5259afb3 Revert r343589 "[clangd] Dex: add Corpus factory for iterators, rename, fold constant. NFC"
Declaring a field with the same name as a type causes GCC to error out:

Dex.h:104:10: error: declaration of 'clang::clangd::dex::Corpus clang::clangd::dex::Dex::Corpus' [-fpermissive]
   Corpus Corpus;
          ^
Iterator.h:127:7: error: changes meaning of 'Corpus' from 'class clang::clangd::dex::Corpus' [-fpermissive]
 class Corpus {
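
A reduced illustration of the error quoted above, under the assumption that renaming the member is an acceptable fix (the reland may have resolved it differently). The class bodies are hypothetical stand-ins, and `Corp` is an invented member name.

```cpp
#include <cstddef>

namespace dex {

// Stand-in for the factory type; the real class has a different interface.
class Corpus {
public:
  explicit Corpus(std::size_t Size) : Size(Size) {}
  std::size_t size() const { return Size; }

private:
  std::size_t Size;
};

class Dex {
public:
  explicit Dex(std::size_t Size) : Corp(Size) {}
  std::size_t size() const { return Corp.size(); }

private:
  // Writing `Corpus Corpus;` here is what GCC diagnoses: the member name
  // would change what `Corpus` means inside the class after it has already
  // been used as a type. Giving the member a distinct name avoids that.
  Corpus Corp;
};

} // namespace dex
```

Some compilers accept the original spelling without complaint, which may be why the problem only surfaced on GCC builds.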

llvm-svn: 343610
2018-10-02 17:31:43 +00:00
Sam McCall 51be55d0ec [clangd] Zap TODONEs
llvm-svn: 343590
2018-10-02 13:51:43 +00:00
Sam McCall a1e7385d5c [clangd] Dex: add Corpus factory for iterators, rename, fold constant. NFC
Summary:
- Corpus avoids having to pass size to the true iterator, and (soon) any
  iterator that might optimize down to true.
- Shorten names of factory functions now they're scoped to the Corpus.
  intersect() and unionOf() rather than createAnd() or createOr() as this
  seems to read better to me, and fits with other short names. Opinion wanted!
- DEFAULT_BOOST_SCORE --> 1. This is a multiplier, don't obfuscate identity.
- Simplify variadic templates in Iterator.h

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52711

llvm-svn: 343589
2018-10-02 13:44:26 +00:00
Sam McCall 7402836042 [clangd] Dex iterator printer shows query structure, not iterator state.
Summary:
This makes it suitable for logging (which immediately found a bug, to
be fixed in the next patch...)

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52715

llvm-svn: 343580
2018-10-02 11:51:36 +00:00
Sam McCall 329fc143fd [clangd] Query dex index using query-style trigrams, not identifier-style trigrams
llvm-svn: 343453
2018-10-01 10:42:51 +00:00
Eric Liu d5d6a60a78 [clangd] Fix header mapping for std::string. NFC
Some implementations have std::string declared in <iosfwd>.

llvm-svn: 343448
2018-10-01 08:50:49 +00:00
Eric Liu 670c147d83 [clangd] Initial support for cross-namespace global code completion.
Summary:
When no scope qualifier is specified, allow completing index symbols
from any scope and insert the proper qualifier automatically. This is still experimental and
hidden behind a flag.

Things missing:
- Scope proximity based scoring.
- FuzzyFind supports weighted scopes.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: kbobyrev, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52364

llvm-svn: 343248
2018-09-27 18:46:00 +00:00
Eric Liu ee7fe93fa8 [clangd] Add more tracing to index queries. NFC
Reviewers: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52611

llvm-svn: 343247
2018-09-27 18:23:23 +00:00
Kirill Bobyrev ea4f20c6be [clangd] Fix bugs with incorrect memory estimate report
* With the current implementation, `sizeof(std::vector<Chunk>)` is added
twice to the `Dex` memory estimate which is incorrect
* `Dex` logs memory usage estimation before `BackingDataSize` is set and
hence the log report excludes size of the external `SymbolSlab` which is
coupled with `Dex` instance

Reviewed By: ioeric

Differential Revision: https://reviews.llvm.org/D52503

llvm-svn: 343117
2018-09-26 15:06:23 +00:00
Kirill Bobyrev 0cdf629394 [docs] Update PostingList string representation format
Because `PostingList` objects are compressed, it is now impossible to
see elements other than the current one and the documentation doesn't
match implementation anymore.

Reviewed By: ioeric

Differential Revision: https://reviews.llvm.org/D52545

llvm-svn: 343116
2018-09-26 14:59:49 +00:00
Simon Pilgrim 3462e76ba5 Removed extra semicolon to fix Wpedantic. (NFCI).
llvm-svn: 343083
2018-09-26 09:02:45 +00:00
Sam McCall 321d5d4802 [clangd] Extract mapper logic from clangd-indexer into a library.
Summary: Soon we can drop support for MR-via-YAML.
I need to modify some out-of-tree versions to use the library, first.

Reviewers: kadircet

Subscribers: mgorny, ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D52465

llvm-svn: 343019
2018-09-25 20:02:36 +00:00
Sam McCall e38c7f8d96 [clangd] Fix reversed RIFF/YAML serialization
llvm-svn: 343017
2018-09-25 19:53:33 +00:00
Sam McCall 02d600d267 [clangd] Merge binary + YAML serialization behind a (mostly) common interface.
Summary:
Interface is in one file, implementation in two as they have little in common.
A couple of ad-hoc YAML functions left exposed:
 - symbol -> YAML I expect to keep for tools like dexp
 - YAML -> symbol is used for the MR-style indexer, I think we can eliminate
   this (merge-on-the-fly, else use a different serialization)

Reviewers: kbobyrev

Subscribers: mgorny, ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52453

llvm-svn: 342999
2018-09-25 18:06:43 +00:00
Kirill Bobyrev d041f8a9d0 [clangd] NFC: Simplify code, enforce LLVM Coding Standards
For consistency, functional-style code pieces are replaced with their
simple counterparts to improve readability.

Also, file headers are fixed to comply with LLVM Coding Standards.

`static` member of anonymous namespace is not marked `static` anymore,
because it is redundant.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D52466

llvm-svn: 342974
2018-09-25 13:58:48 +00:00
Kirill Bobyrev 69e6388564 [clangd] Fix some buildbots after r342965
Some compilers fail to parse struct default member initializer.

llvm-svn: 342970
2018-09-25 13:14:11 +00:00
Kirill Bobyrev 6c2f5bd0f1 [clangd] Implement VByte PostingList compression
This patch implements Variable-length Byte compression of `PostingList`s
to sacrifice some performance for lower memory consumption.
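
For readers unfamiliar with the technique, a generic sketch of variable-length byte encoding over sorted IDs follows. It stores gaps between consecutive IDs, 7 bits per byte; the function names are hypothetical and the chunked layout of the real `PostingList` is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Appends Value as a variable-length byte sequence: 7 payload bits per byte,
// least-significant group first, high bit set on every byte except the last.
void encodeVByte(std::uint32_t Value, std::vector<std::uint8_t> &Out) {
  while (Value >= 0x80) {
    Out.push_back(static_cast<std::uint8_t>((Value & 0x7F) | 0x80));
    Value >>= 7;
  }
  Out.push_back(static_cast<std::uint8_t>(Value));
}

// Compresses a sorted list of IDs as VByte-encoded gaps between neighbours.
std::vector<std::uint8_t> compress(const std::vector<std::uint32_t> &Sorted) {
  std::vector<std::uint8_t> Bytes;
  std::uint32_t Previous = 0;
  for (std::uint32_t ID : Sorted) {
    assert(ID >= Previous && "IDs must be sorted");
    encodeVByte(ID - Previous, Bytes);
    Previous = ID;
  }
  return Bytes;
}

// Inverse of compress(); assumes Bytes was produced by compress().
std::vector<std::uint32_t> decompress(const std::vector<std::uint8_t> &Bytes) {
  std::vector<std::uint32_t> IDs;
  std::uint32_t Previous = 0;
  std::uint32_t Delta = 0;
  unsigned Shift = 0;
  for (std::uint8_t Byte : Bytes) {
    Delta |= static_cast<std::uint32_t>(Byte & 0x7F) << Shift;
    if (Byte & 0x80) { // continuation bit: more groups follow
      Shift += 7;
      continue;
    }
    Previous += Delta;
    IDs.push_back(Previous);
    Delta = 0;
    Shift = 0;
  }
  return IDs;
}
```

Small gaps fit in a single byte, which is where the memory saving comes from; the cost is that elements can no longer be accessed without decoding.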

`PostingList` compression and decompression was extensively tested using
fuzzer for multiple hours and running a significant number of realistic
`FuzzyFindRequests`. AddressSanitizer and UndefinedBehaviorSanitizer
were used to ensure the correct behaviour.

Performance evaluation was conducted with recent LLVM symbol index (292k
symbols) and the collection of user-recorded queries (7751
`FuzzyFindRequest` JSON dumps):

| Metrics | Before| After | Change (%)
| -----  | -----  | -----   | -----
| Memory consumption (posting lists only), MB  |  54.4 | 23.5 | -60%
| Time to process queries, sec | 7.70 | 9.4 | +25%

Reviewers: sammccall, ioeric

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D52300

llvm-svn: 342965
2018-09-25 11:54:51 +00:00
Sam McCall 3ca9759a21 [clangd] Fix uninit bool in r342888
llvm-svn: 342903
2018-09-24 16:52:48 +00:00
Sam McCall 8fb7bb2482 [clangd] Do bounds checks while reading data, otherwise var-length records are too painful. NFC
llvm-svn: 342888
2018-09-24 14:51:15 +00:00
Kirill Bobyrev 94af0612e0 [clangd] Force Dex to respect symbol collector flags
`Dex` should utilize `FuzzyFindRequest.RestrictForCodeCompletion` flags
and omit symbols not meant for code completion when asked for it.

The measurements below were conducted with setting
`FuzzyFindRequest.RestrictForCodeCompletion` to `true` (so that it's
more realistic). Sadly, the average latency goes up; I suspect that is
mostly because of the empty queries where the number of posting lists is
critical.

| Metrics  | Before | After | Relative difference
| -----  | -----  | -----   | -----
| Cumulative query latency (7000 `FuzzyFindRequest`s over LLVM static index)  | 6182735043 ns    | 7202442053 ns | +16%
| Whole Index size | 81.24 MB    | 81.79 MB | +0.6%

Out of 292252 symbols collected from LLVM codebase 136926 appear to be
restricted for code completion.

Reviewers: ioeric

Differential Revision: https://reviews.llvm.org/D52357

llvm-svn: 342866
2018-09-24 08:45:18 +00:00
Eric Liu c275fb2a5d [clangd] Remember to serialize symbol origin in YAML.
llvm-svn: 342730
2018-09-21 13:04:57 +00:00
Eric Liu 467c5f9ce0 [clangd] Store preamble macros in dynamic index.
Summary:
Pros:
o Loading macros from preamble for every completion is slow (see profile).
o Calculating macro USR is also slow (see profile).
o Sema can provide a lot of macro completion results (e.g. when filter is empty,
60k for some large TUs!).

Cons:
o Slight memory increase in dynamic index (~1%).
o Some extra work during preamble build (should be fine as preamble build and
indexAST is way slower).

Before:
{F7195645}

After:
{F7195646}

Reviewers: ilya-biryukov, sammccall

Reviewed By: sammccall

Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52078

llvm-svn: 342529
2018-09-19 09:35:04 +00:00
Sam McCall 46b5555844 [clangd] Fix error handling for SymbolID parsing (notably YAML and dexp)
llvm-svn: 342505
2018-09-18 19:00:59 +00:00
Eric Liu 764f461f9c [clangd] Get rid of Decls parameter in indexMainDecls. NFC
It's already available in ParsedAST.

llvm-svn: 342473
2018-09-18 13:35:16 +00:00
Eric Liu 821a116818 [clangd] Merge ClangdServer::DynamicIndex into FileIndex. NFC.
Summary:
FileIndex now provides explicit interfaces for preamble and main file updates.
This avoids growing parameter list when preamble and main symbols diverge
further (e.g. D52078). This also gets rid of the hack in `indexAST` that
inferred main file index based on `TopLevelDecls`.

Also separate `indexMainDecls` from `indexAST`.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D52222

llvm-svn: 342460
2018-09-18 10:30:44 +00:00
Sam McCall 3bf9b6d920 [clangd] dexp tool uses llvm::cl to parse its flags.
Summary:
We can use cl::ResetCommandLineParser() to support different types of
command-lines, as long as we're careful about option lifetimes.
(I tried using subcommands, but the error messages were bad)
I found a mostly-reasonable pattern to isolate the fiddly parts.

Added -scope and -limit flags to the `find` command to demonstrate.
(Note that scope support seems to be broken in dex?)

Fixed symbol lookup to parse symbol IDs.

Caveats:
 - with command help (e.g. `find -help`), you also get some spam
   about required arguments. This is a bug in llvm::cl, which prints
   these to errs() rather than the designated stream.

Reviewers: kbobyrev

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51989

llvm-svn: 342456
2018-09-18 09:49:57 +00:00
Eric Liu f736766659 [clangd] Adapt API change after 342451.
llvm-svn: 342452
2018-09-18 08:52:14 +00:00
Eric Liu a57afd091f [clangd] Get rid of AST matchers in SymbolCollector. NFC
Reviewers: ilya-biryukov, kadircet

Subscribers: MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D52089

llvm-svn: 342362
2018-09-17 07:43:49 +00:00
Kirill Bobyrev 249c5864cf [clangd] Introduce PostingList interface
This patch abstracts `PostingList` interface and reuses existing
implementation. It will be used later to test different `PostingList`
representations.

No functionality change is introduced, this patch is mostly refactoring
so that the following patches could focus on functionality while not
being too hard to review.

Reviewed By: sammccall, ioeric

Differential Revision: https://reviews.llvm.org/D51982

llvm-svn: 342155
2018-09-13 17:11:03 +00:00
Kirill Bobyrev bd72b08eb3 [clangd] Fix Dexp build
%s/MaxCandidateCount/Limit/g after rL342138.

llvm-svn: 342143
2018-09-13 15:35:55 +00:00
Kirill Bobyrev e6dd0806c7 [clangd] Cleanup FuzzyFindRequest filtering limit semantics
As discussed during D51860 review, it is better to use `llvm::Optional`
here as it has clear semantics which reflect intended behavior.
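
A small sketch of the idea, using `std::optional` as a stand-in for `llvm::Optional` so that it compiles without LLVM. `Request` and `fuzzyFind` below are hypothetical, cut-down versions written for illustration only.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical request type; the real FuzzyFindRequest has more fields.
struct Request {
  std::string Query;
  std::optional<std::uint32_t> Limit; // empty means "no limit"
};

// Returns candidates containing Query, truncated only if a limit is set.
std::vector<std::string>
fuzzyFind(const Request &Req, const std::vector<std::string> &Candidates) {
  std::vector<std::string> Results;
  for (const std::string &Candidate : Candidates) {
    if (Req.Limit && Results.size() >= *Req.Limit)
      break;
    if (Candidate.find(Req.Query) != std::string::npos)
      Results.push_back(Candidate);
  }
  return Results;
}
```

Compared with a sentinel such as the maximum integer value, the empty state makes "unlimited" explicit at every call site.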

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D52028

llvm-svn: 342138
2018-09-13 14:27:03 +00:00
Kirill Bobyrev d9f33b129c [clangd] Don't create child AND and OR iterators with one posting list
`AND( AND( Child ) ... )` -> `AND( Child ... )`
`AND( OR( Child ) ... )` -> `AND( Child ... )`

This simple optimization results in 5-6% performance improvement in the
benchmark with 2000 serialized `FuzzyFindRequest`s.

Reviewed By: ilya-biryukov

Differential Revision: https://reviews.llvm.org/D52016

llvm-svn: 342124
2018-09-13 10:02:48 +00:00
Heejin Ahn 386d272387 [clangd] Add missing clangBasic target_link_libraries
Without this, builds with `-DSHARED_LIB=ON` fail.

llvm-svn: 342037
2018-09-12 09:40:13 +00:00
Kirill Bobyrev e1e19c7b75 [clangd] Implement a Proof-of-Concept tool for symbol index exploration
Reviewed By: sammccall, ilya-biryukov

Differential Revision: https://reviews.llvm.org/D51628

llvm-svn: 342025
2018-09-12 07:32:54 +00:00
Kirill Bobyrev 0dee397e06 [clangd] NFC: Use uint32_t for FuzzyFindRequest limits
Reviewed By: ioeric

Differential Revision: https://reviews.llvm.org/D51860

llvm-svn: 341921
2018-09-11 10:31:38 +00:00
Kirill Bobyrev 5faf8a3d84 [clangd] Unbreak buildbots after r341802
Solution: use std::move when returning result from toJSON(...).
llvm-svn: 341832
2018-09-10 14:31:38 +00:00
Kirill Bobyrev 09f00dcf69 [clangd] Implement FuzzyFindRequest JSON (de)serialization
JSON (de)serialization of `FuzzyFindRequest` might be useful for both
D51090 and D51628. Also, this allows precise logging of the fuzzy find
requests.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D51852

llvm-svn: 341802
2018-09-10 11:51:05 +00:00
Kirill Bobyrev 38a889c185 [clangd] Add symbol slab size to index memory consumption estimates
Currently, `SymbolIndex::estimateMemoryUsage()` returns the "overhead"
estimate, i.e. the estimate of the Index data structure excluding
backing data (such as Symbol Slab and Reference Slab). This patch
propagates information about paired data size where necessary.

Reviewed By: ioeric, sammccall

Differential Revision: https://reviews.llvm.org/D51539

llvm-svn: 341800
2018-09-10 11:46:07 +00:00
Kirill Bobyrev 5abe478a3d [clangd] NFC: Rename DexIndex to Dex
Also, cleanup some redundant includes.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D51774

llvm-svn: 341784
2018-09-10 08:23:53 +00:00
Kirill Bobyrev 59491a1fa9 [clangd] Make advanceTo() faster on Posting Lists
If the current element is already beyond advanceTo()'s DocID, just
return instead of doing binary search. This simple optimization saves up
to 6-7% performance.
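
The early return can be sketched like this, assuming a plain sorted vector as the backing store. `PostingCursor` is an invented name and not the class touched by the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using DocID = std::uint32_t;

// Minimal cursor over a sorted posting list; names are illustrative.
class PostingCursor {
public:
  explicit PostingCursor(std::vector<DocID> Sorted) : Docs(std::move(Sorted)) {}

  bool reachedEnd() const { return Index == Docs.size(); }

  DocID peek() const {
    assert(!reachedEnd() && "peek() called past the end");
    return Docs[Index];
  }

  // Moves to the first element that is >= ID.
  void advanceTo(DocID ID) {
    // Fast path: already at or beyond ID, so skip the binary search.
    if (reachedEnd() || Docs[Index] >= ID)
      return;
    auto It = std::lower_bound(Docs.begin() + Index, Docs.end(), ID);
    Index = static_cast<std::size_t>(It - Docs.begin());
  }

private:
  std::vector<DocID> Docs;
  std::size_t Index = 0;
};
```

In an AND over several lists, many `advanceTo()` calls target an ID the cursor has already passed, so the single comparison is often all that runs.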

Reviewed By: ilya-biryukov

Differential Revision: https://reviews.llvm.org/D51802

llvm-svn: 341781
2018-09-10 07:57:28 +00:00
Eric Liu f76886859f [clangd] Canonicalize include paths in clangd.
Get rid of "../"  and "../../".

llvm-svn: 341645
2018-09-07 09:40:36 +00:00
Eric Liu 6df66001ee [clangd] Add "Deprecated" field to Symbol and CodeCompletion.
Summary: Also set "deprecated" field in LSP CompletionItem.

Reviewers: sammccall, kadircet

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D51724

llvm-svn: 341576
2018-09-06 18:52:26 +00:00
Kirill Bobyrev 049b2d4345 [clangd] Fix Dex initialization
This patch sets URI schemes of Dex to SymbolCollector's default schemes
in case callers tried to pass empty list of schemes. This was the case
for initialization in Clangd main and was a cause of incorrect
behavior.

Also, it fixes a bug with a missing `continue;` after spotting invalid URI
scheme conversion.

llvm-svn: 341552
2018-09-06 15:10:10 +00:00
Kirill Bobyrev afbf31854d [clangd] NFC: Use TopN instead of std::priority_queue
Quality.cpp defines a structure for convenient storage of Top N items,
it should be used instead of the `std::priority_queue` with slightly
obscure semantics.

This patch does not affect functionality.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D51676

llvm-svn: 341544
2018-09-06 13:15:03 +00:00
Kirill Bobyrev e4ee0213d4 [clangd] NFC: mark single-parameter constructors explicit
Code health: prevent implicit conversions to user-defined types.
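
As a brief illustration of what `explicit` buys here (the `Limit` type and `clampTo` function are made up for this example):

```cpp
#include <cstddef>
#include <type_traits>

// Illustrative wrapper type; not a clangd class.
class Limit {
public:
  // Without `explicit`, a call such as clampTo(10, 3) would silently build
  // a Limit from the integer. With it, the conversion must be spelled out.
  explicit Limit(std::size_t Count) : Count(Count) {}
  std::size_t count() const { return Count; }

private:
  std::size_t Count;
};

std::size_t clampTo(std::size_t N, Limit L) {
  return N < L.count() ? N : L.count();
}

// Direct construction still works; implicit conversion does not.
static_assert(std::is_constructible<Limit, std::size_t>::value,
              "Limit(n) is allowed");
static_assert(!std::is_convertible<std::size_t, Limit>::value,
              "an integer does not silently become a Limit");
```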

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D51690

llvm-svn: 341543
2018-09-06 13:06:04 +00:00
Kirill Bobyrev 19a9461e5f [clangd] Implement proximity path boosting for Dex
This patch introduces `PathURI` Search Token kind and utilizes it to
uprank symbols which are defined in files with small distance to the
directory where the fuzzy find request is coming from (e.g. files user
is editing).

Reviewed By: ioeric

Reviewers: ioeric, sammccall

Differential Revision: https://reviews.llvm.org/D51481

llvm-svn: 341542
2018-09-06 12:54:43 +00:00
Eric Liu d25f1214a8 [clangd] Set SymbolID for sema macros so that they can be merged with index macros.
Reviewers: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51688

llvm-svn: 341534
2018-09-06 09:59:37 +00:00
Sam McCall e4fa7b8418 [clangd] make zlib compression optional for binary format
llvm-svn: 341465
2018-09-05 13:17:47 +00:00
Sam McCall d85264bf53 [clangd] Fix buildbot failures on older compilers from r341375
llvm-svn: 341451
2018-09-05 07:52:49 +00:00
Sam McCall 76c4c3af52 [clangd] Load static index asynchronously, add tracing.
Summary:
Like D51475 but simplified based on recent patches.
While here, clarify that loadIndex() takes a filename, not file content.

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51638

llvm-svn: 341376
2018-09-04 16:19:40 +00:00
Sam McCall 50f3631057 [clangd] Define a compact binary serialization format for symbol slab/index.
Summary:
This is intended to replace the current YAML format for general use.
It's ~10x more compact than YAML, and ~40% more compact than gzipped YAML:
  llvmidx.riff = 20M, llvmidx.yaml = 272M, llvmidx.yaml.gz = 32M
It's also simpler/faster to read and write.

The format is a RIFF container (chunks of (type, size, data)) with:
 - a compressed string table
 - simple binary encoding of symbols (with varints for compactness)
It can be extended to include occurrences, Dex posting lists, etc.
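
To make the "(type, size, data)" layout concrete, here is a sketch of writing and reading chunks in the generic RIFF style: a four-character type, a 32-bit little-endian size, the payload, and a pad byte after odd-sized payloads. The outer file header, the string table and the symbol encoding are left out, and `Chunk`, `writeChunk` and `readChunks` are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

struct Chunk {
  std::array<char, 4> Type; // four-character code
  std::string Data;         // raw payload bytes
};

// Serializes one chunk: type, little-endian size, payload, optional pad byte.
void writeChunk(const Chunk &C, std::string &Out) {
  Out.append(C.Type.data(), C.Type.size());
  const auto Size = static_cast<std::uint32_t>(C.Data.size());
  for (unsigned Shift = 0; Shift < 32; Shift += 8)
    Out.push_back(static_cast<char>((Size >> Shift) & 0xFF));
  Out += C.Data;
  if (Size % 2 != 0)
    Out.push_back('\0');
}

// Parses consecutive chunks, checking bounds before every read.
// Returns false if a header or payload is truncated.
bool readChunks(const std::string &In, std::vector<Chunk> &Chunks) {
  std::size_t Pos = 0;
  while (Pos < In.size()) {
    if (In.size() - Pos < 8)
      return false; // not enough bytes for type + size
    Chunk C;
    std::uint32_t Size = 0;
    for (std::size_t I = 0; I < 4; ++I) {
      C.Type[I] = In[Pos + I];
      const auto Byte = static_cast<unsigned char>(In[Pos + 4 + I]);
      Size |= static_cast<std::uint32_t>(Byte) << (8 * I);
    }
    Pos += 8;
    if (In.size() - Pos < Size)
      return false; // payload runs past the end of the buffer
    C.Data = In.substr(Pos, Size);
    Pos += Size + (Size % 2); // skip the pad byte after odd payloads
    Chunks.push_back(std::move(C));
  }
  return true;
}
```

Since each chunk carries its own size, a reader can skip chunk types it does not understand, which is what makes individual pieces optional.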

There's no rich backwards-compatibility scheme, but a version number is included
so we can detect incompatible files and do ad-hoc back-compat.

Alternatives considered:
 - compressed YAML or JSON: bulky and slow to load
 - llvm bitstream: confusing model and libraries are hard to use. My attempt
   produced slightly larger files, and the code was longer and slower.
 - protobuf or similar: would be really nice (esp for back-compat) but the
   dependency is a big hassle
 - ad-hoc binary format without a container: it seems clear we're going
   to add posting lists and occurrences here, and that they will benefit
   from sharing a string table. The container makes it easy to debug
   these pieces in isolation, and make them optional.

Reviewers: ioeric

Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51585

llvm-svn: 341375
2018-09-04 16:16:50 +00:00
Kirill Bobyrev cc8b507a60 [clangd] NFC: Change quality type to float
Reviewed by: sammccall

Differential Revision: https://reviews.llvm.org/D51636

llvm-svn: 341374
2018-09-04 15:45:56 +00:00
Kirill Bobyrev d5bc65444c [clangd] Move buildStaticIndex() to SymbolYAML
`buildStaticIndex()` is used by two other tools that I'm building, now
it's useful outside of `tool/ClangdMain.cpp`.

Also, slightly refactor the code while moving it to the different source
file.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D51626

llvm-svn: 341369
2018-09-04 15:10:40 +00:00
Sam McCall b0138317d6 [clangd] SymbolOccurrences -> Refs and cleanup
Summary:
A few things that I noticed while merging the SwapIndex patch:
 - SymbolOccurrences and particularly SymbolOccurrenceSlab are unwieldy names,
   and these names appear *a lot*. Ref, RefSlab, etc seem clear enough
   and read/format much better.
 - The asymmetry between SymbolSlab and RefSlab (build() vs freeze()) is
   confusing and irritating, and doesn't even save much code.
   Avoiding RefSlab::Builder was my idea, but it was a bad one; add it.
 - DenseMap<SymbolID, ArrayRef<Ref>> seems like a reasonable compromise for
   constructing MemIndex - and means many fewer wasted allocations than the
   current DenseMap<SymbolID, vector<Ref*>> for FileIndex, and none for
   slabs.
 - RefSlab::find() is not actually used for anything, so we can throw
   away the DenseMap and keep the representation much more compact.
 - A few naming/consistency fixes: e.g. Slabs,Refs -> Symbols,Refs.

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51605

llvm-svn: 341368
2018-09-04 14:39:56 +00:00
Sam McCall dd4a24c86c [clangd] Fix index-twice regression from r341242
llvm-svn: 341337
2018-09-03 20:26:26 +00:00
Sam McCall 9c7624e14b [clangd] Factor out the data-swapping functionality from MemIndex/DexIndex.
Summary:
This is now handled by a wrapper class SwapIndex, so MemIndex/DexIndex can be
immutable and focus on their job.

Old and busted:
 I have a MemIndex, which holds a shared_ptr<vector<Symbol*>>, which keeps the
 symbol slab alive. I update by calling build(shared_ptr<vector<Symbol*>>).

New hotness: I have a SwapIndex, which holds a unique_ptr<SymbolIndex>, which
 holds a MemIndex, which holds a shared_ptr<void>, which keeps backing
 data alive.
 I update by building a new MemIndex and calling SwapIndex::reset().
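
The wrapper pattern can be sketched as below. This is an assumption-laden toy: it holds a `shared_ptr` so that in-flight queries keep the old index alive, and the `Index`, `ImmutableIndex` and `Swappable` names and the `contains()` method are invented for the example.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Illustrative interface; the real SymbolIndex has a richer API.
class Index {
public:
  virtual ~Index() = default;
  virtual bool contains(const std::string &Name) const = 0;
};

// Never changes after construction, so it needs no locking of its own.
class ImmutableIndex : public Index {
public:
  explicit ImmutableIndex(std::vector<std::string> Names)
      : Names(std::move(Names)) {}

  bool contains(const std::string &Name) const override {
    for (const std::string &N : Names)
      if (N == Name)
        return true;
    return false;
  }

private:
  const std::vector<std::string> Names;
};

// Forwards queries to whichever index is currently installed.
class Swappable : public Index {
public:
  explicit Swappable(std::shared_ptr<Index> Initial)
      : Current(std::move(Initial)) {
    assert(Current && "an initial index is required");
  }

  // Installs a freshly built index; the old one is freed once unused.
  void reset(std::shared_ptr<Index> New) {
    assert(New && "cannot install a null index");
    std::lock_guard<std::mutex> Lock(Mu);
    Current = std::move(New);
  }

  bool contains(const std::string &Name) const override {
    return snapshot()->contains(Name);
  }

private:
  // Copies the pointer under the lock so the query itself runs unlocked.
  std::shared_ptr<Index> snapshot() const {
    std::lock_guard<std::mutex> Lock(Mu);
    return Current;
  }

  mutable std::mutex Mu;
  std::shared_ptr<Index> Current;
};
```

The lock only guards the pointer swap, so a slow query never blocks an update and an update never invalidates a running query.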

Reviewers: kbobyrev, ioeric

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51422

llvm-svn: 341318
2018-09-03 14:37:43 +00:00
Eric Liu 83f63e42b2 [clangd] Support multiple #include headers in one symbol.
Summary:
Currently, a symbol can have only one #include header attached, which
might not work well if the symbol can be imported via different #includes depending
on where it's used. This patch stores multiple #include headers (with # references)
for each symbol, so that CodeCompletion can decide which include to insert.

In this patch, code completion simply picks the most popular include as the default inserted header. We also return all possible includes and their edits in the `CodeCompletion` results.
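
Picking the most popular include could look roughly like this. The `IncludeHeader` struct and `pickDefaultInclude` function are hypothetical, and breaking ties in favour of the earliest entry is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record pairing a header with how often it was seen.
struct IncludeHeader {
  std::string Header;
  unsigned References = 0;
};

// Returns the most frequently seen header; ties go to the earliest entry.
const IncludeHeader &
pickDefaultInclude(const std::vector<IncludeHeader> &Includes) {
  assert(!Includes.empty() && "symbol has no include headers");
  return *std::max_element(
      Includes.begin(), Includes.end(),
      [](const IncludeHeader &A, const IncludeHeader &B) {
        return A.References < B.References;
      });
}
```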

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: mgrang, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51291

llvm-svn: 341304
2018-09-03 10:18:21 +00:00
Fangrui Song 399943bc76 [clangd] Fix many typos. NFC
llvm-svn: 341273
2018-09-01 07:47:03 +00:00
Haojian Wu e8064b6f6d [clangd] Implement findOccurrences interface in dynamic index.
Summary:
Implement the interface in
  - FileIndex
  - MemIndex
  - MergeIndex

Depends on https://reviews.llvm.org/D50385.

Reviewers: sammccall, ilya-biryukov

Reviewed By: sammccall

Subscribers: mgrang, ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51279

llvm-svn: 341242
2018-08-31 19:53:37 +00:00
Sam McCall 2e5700f038 [clangd] Flatten out Symbol::Details. It was ill-conceived, sorry.
Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51504

llvm-svn: 341211
2018-08-31 13:55:01 +00:00
Haojian Wu d81e3146e3 [clangd] Collect symbol occurrences in SymbolCollector.
SymbolCollector will be used for two cases:
 - collect Symbol type only, used for indexing preamble AST.
 - collect Symbol and SymbolOccurrences, used for indexing main AST.

For finding local references from the AST, we will implement it in other ways.

llvm-svn: 341208
2018-08-31 12:54:13 +00:00
Kirill Bobyrev 493b1627ca [NFC] Cleanup Dex
* Use consistent assertion messages in iterators implementations
* Silence a bunch of clang-tidy warnings: use `emplace_back` instead of
  `push_back` where possible, make sure arguments have the same name in
  header and implementation file, use for loop over ranges where possible

Reviewed by: ioeric

Differential Revision: https://reviews.llvm.org/D51528

llvm-svn: 341190
2018-08-31 09:17:02 +00:00
Kirill Bobyrev a2f146fd9c [clangd] Remove UB introduced in rL341057
llvm-svn: 341066
2018-08-30 13:30:34 +00:00
Kirill Bobyrev 38bdac5db8 [clangd] Implement iterator cost
This patch introduces iterator cost concept to improve the performance
of Dex query iterators (mainly, AND iterator). Benchmarks show that the
queries become ~10% faster.
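
One way to picture why a cost estimate helps an AND: drive the intersection from the cheapest input and only probe the others. The sketch below uses list length as the cost, which is an assumption; the patch's actual cost function and iterator classes are not shown, and `intersect` is an invented helper.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using DocID = std::uint32_t;

// Intersects sorted lists, iterating over the shortest one and probing the
// rest with binary search. Taking Lists by value lets us reorder them.
std::vector<DocID> intersect(std::vector<std::vector<DocID>> Lists) {
  std::vector<DocID> Result;
  if (Lists.empty())
    return Result;
  // Cheapest (shortest) list first: it bounds the number of probes.
  std::sort(Lists.begin(), Lists.end(),
            [](const std::vector<DocID> &A, const std::vector<DocID> &B) {
              return A.size() < B.size();
            });
  for (DocID ID : Lists.front()) {
    bool InAll = true;
    for (std::size_t I = 1; I < Lists.size() && InAll; ++I)
      InAll = std::binary_search(Lists[I].begin(), Lists[I].end(), ID);
    if (InAll)
      Result.push_back(ID);
  }
  return Result;
}
```

With one short and one very long list, the work is proportional to the short list times the logarithm of the long one, rather than to the sum of both lengths.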

Before

```
-------------------------------------------------------
Benchmark                Time           CPU Iteration
-------------------------------------------------------
DexAdHocQueries    5883074 ns    5883018 ns        117
DexRealQ         959904457 ns  959898507 ns          1
```

After

```
-------------------------------------------------------
Benchmark                Time           CPU Iteration
-------------------------------------------------------
DexAdHocQueries    5238403 ns    5238361 ns        130
DexRealQ         873275207 ns  873269453 ns          1
```
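
As a rough illustration of why cost ordering helps an AND query (a minimal
sketch, not the Dex code from this patch): if the list length is used as the
cost estimate and the cheapest lists are intersected first, the running result
shrinks as early as possible. `PostingList` and `intersectByCost` are
hypothetical names introduced only for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical stand-in for a posting list: sorted, unique document IDs.
using PostingList = std::vector<unsigned>;

// Intersects all lists, visiting them from cheapest (shortest) to most
// expensive (longest). Assumption: list length is a usable cost estimate.
std::vector<unsigned> intersectByCost(std::vector<PostingList> Lists) {
  if (Lists.empty())
    return {};
  std::sort(Lists.begin(), Lists.end(),
            [](const PostingList &L, const PostingList &R) {
              return L.size() < R.size();
            });
  std::vector<unsigned> Result = Lists.front();
  // Stop early once the intersection is empty: no later list can add items.
  for (std::size_t I = 1; I < Lists.size() && !Result.empty(); ++I) {
    std::vector<unsigned> Next;
    std::set_intersection(Result.begin(), Result.end(), Lists[I].begin(),
                          Lists[I].end(), std::back_inserter(Next));
    Result = std::move(Next);
  }
  return Result;
}
```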

Reviewed by: sammccall

Differential Revision: https://reviews.llvm.org/D51310

llvm-svn: 341057
2018-08-30 11:23:58 +00:00
Kirill Bobyrev b217ddb1bb [clangd] Use TRUE iterator instead of complete posting list
Stop using `$$$` (empty) trigram and generating a posting list with all
items. Since TRUE iterator is already implemented and correctly inserted
when there are no real trigram posting lists, this is a valid
transformation.

Benchmarks show that this simple change allows ~30% speedup on dataset
of real completion queries.

Before

```
-------------------------------------------------------
Benchmark                Time           CPU Iterations
-------------------------------------------------------
DexAdHocQueries    5640321 ns    5640265 ns        120
DexRealQ         939835603 ns  939830296 ns          1
```

After

```
-------------------------------------------------------
Benchmark                Time           CPU Iterations
-------------------------------------------------------
DexAdHocQueries    3452014 ns    3451987 ns        203
DexRealQ         667455912 ns  667455750 ns          1
```

Reviewed by: ilya-biryukov

Differential Revision: https://reviews.llvm.org/D51287

llvm-svn: 340729
2018-08-27 09:47:50 +00:00
Kirill Bobyrev a98961bc84 [clangd] Implement LIMIT iterator
This patch introduces the LIMIT iterator, which is very important for
improving the quality of search queries. LIMIT iterators can be applied on
top of BOOST iterators to prevent populating the query request with a huge
number of low-quality symbols.
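
A minimal sketch of the limiting idea, assuming a simplified iterator interface
(`reachedEnd`/`advance`/`peek`); `ListIterator` and `LimitIterator` here are
hypothetical illustrations, not the classes added by this patch:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical child iterator over a sorted list of document IDs.
class ListIterator {
public:
  explicit ListIterator(std::vector<unsigned> IDs) : IDs(std::move(IDs)) {}
  bool reachedEnd() const { return Pos == IDs.size(); }
  void advance() { ++Pos; }
  unsigned peek() const { return IDs[Pos]; }

private:
  std::vector<unsigned> IDs;
  std::size_t Pos = 0;
};

// Yields at most Limit items of the wrapped iterator, then reports the end.
class LimitIterator {
public:
  LimitIterator(ListIterator Child, std::size_t Limit)
      : Child(std::move(Child)), ItemsLeft(Limit) {}
  bool reachedEnd() const { return ItemsLeft == 0 || Child.reachedEnd(); }
  // Precondition: !reachedEnd().
  void advance() {
    Child.advance();
    --ItemsLeft;
  }
  unsigned peek() const { return Child.peek(); }

private:
  ListIterator Child;
  std::size_t ItemsLeft;
};
```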

Reviewed by: sammccall

Differential Revision: https://reviews.llvm.org/D51029

llvm-svn: 340605
2018-08-24 11:25:43 +00:00
Eric Liu 25d74e9594 [clangd] Speculative code completion index request before Sema is run.
Summary:
For index-based code completion, send an asynchronous speculative index
request, based on the index request for the last code completion on the same
file and the filter text typed before the cursor, before sema code completion
is invoked. This can reduce the code completion latency (by roughly latency of
sema code completion) if the speculative request is the same as the one
generated for the ongoing code completion from sema. As a sequence of code
completions often have the same scopes and proximity paths etc, this should be
effective for a number of code completions.

Trace with speculative index request:{F6997544}

Reviewers: hokein, ilya-biryukov

Reviewed By: ilya-biryukov

Subscribers: javed.absar, jfb, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D50962

llvm-svn: 340604
2018-08-24 11:23:56 +00:00
Kirill Bobyrev fc89001cec [clangd] Log memory usage of DexIndex and MemIndex
This patch prints information about built index size estimation to
verbose logs. This is useful for optimizing memory usage of DexIndex and
comparisons with MemIndex.

Reviewed by: sammccall

Differential Revision: https://reviews.llvm.org/D51154

llvm-svn: 340601
2018-08-24 09:12:54 +00:00
Ilya Biryukov 22abe49fff [clangd] Get rid of regexes in CanonicalIncludes
Summary: Replace them with suffix mappings.

Reviewers: ioeric, kbobyrev

Reviewed By: ioeric

Subscribers: MaskRay, jkorous, arphaman, jfb, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D51088

llvm-svn: 340410
2018-08-22 13:51:19 +00:00
Kirill Bobyrev 7413e985ea [clangd] Implement BOOST iterator
This patch introduces the BOOST iterator - an essential building block for
efficient and high-quality symbol retrieval. The concept of boosting allows
performing computationally inexpensive scoring on the query side, so that
the final (expensive) scoring only needs to be applied to the items with
the highest preliminary score, eliminating the need to score too many
items.
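
The two-stage scoring described above could look roughly like this (a sketch
under assumptions: `Candidate` and `keepTopBoosted` are hypothetical names, and
the boost value stands in for the cheap preliminary score):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical retrieved item with a cheap, query-side boost score.
struct Candidate {
  unsigned ID;
  float Boost;
};

// Keeps only the K candidates with the highest boost, ordered best-first.
// The expensive final scoring would then run on this short list only.
std::vector<Candidate> keepTopBoosted(std::vector<Candidate> Items,
                                      std::size_t K) {
  K = std::min(K, Items.size());
  auto Middle = Items.begin() + static_cast<std::ptrdiff_t>(K);
  std::partial_sort(Items.begin(), Middle, Items.end(),
                    [](const Candidate &L, const Candidate &R) {
                      return L.Boost > R.Boost;
                    });
  Items.erase(Middle, Items.end());
  return Items;
}
```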

Reviewed by: ilya-biryukov

Differential Revision: https://reviews.llvm.org/D50970

llvm-svn: 340409
2018-08-22 13:44:15 +00:00
Ilya Biryukov 8343baf686 [clangd] Make FileIndex aware of the main file
Summary:
It was previously only indexing the preamble decls. The new
implementation will index both the preamble and the main AST and
report both sets of symbols, preferring the ones from the main AST
whenever the symbol is present in both.
The symbols in the main AST slab always store all information
available in the preamble symbols, possibly adding more,
e.g. definition locations.
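
The "prefer the main AST symbol" rule can be pictured with a small sketch.
`Symbol`, `Slab` and `mergeSlabs` are simplified, hypothetical stand-ins and
not the FileIndex data structures:

```cpp
#include <map>
#include <string>

// Hypothetical, heavily simplified symbol record.
struct Symbol {
  std::string Name;
  std::string Definition; // empty when only a declaration is known
};

// Hypothetical slab: symbols keyed by ID.
using Slab = std::map<unsigned, Symbol>;

// Reports symbols from both slabs, keeping the main AST entry whenever the
// same ID appears in both.
Slab mergeSlabs(const Slab &Preamble, const Slab &MainAST) {
  Slab Result = MainAST;
  // std::map::insert never overwrites an existing key, so main AST wins.
  Result.insert(Preamble.begin(), Preamble.end());
  return Result;
}
```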

Reviewers: hokein, ioeric

Reviewed By: ioeric

Subscribers: kadircet, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D50889

llvm-svn: 340404
2018-08-22 12:43:17 +00:00
Kirill Bobyrev 14e4700468 [clangd] Cleanup after D50897
The wrong diff that was uploaded to Phabricator was building the wrong
index.

llvm-svn: 340388
2018-08-22 07:17:59 +00:00
Kirill Bobyrev 7a94c918a0 [clangd] Allow using experimental Dex index
This patch adds a hidden Clangd flag ("use-dex-index") which replaces the
(currently) default `MemIndex` with `DexIndex` for the static index.

Reviewed by: ioeric

Differential Revision: https://reviews.llvm.org/D50897

llvm-svn: 340262
2018-08-21 10:32:27 +00:00
Kirill Bobyrev 870aaf2963 [clangd] DexIndex implementation prototype
This patch is a proof-of-concept Dex index implementation. It has
several flaws, which don't allow replacing static MemIndex yet, such as:

* Not being able to handle queries of small size (fewer than 3 characters);
  a way to solve this is generating trigrams of smaller size and having
  such incomplete trigrams in the index structure.
* Speed measurements: while manually editing files in Vim and requesting
  autocompletion gives an impression that the performance is at least
  comparable with the current static index, having actual numbers is
  important because we don't want to hurt the users and roll out slow
  code. Eric (@ioeric) suggested that we should only replace MemIndex as
  soon as we have the evidence that this is not a regression in terms of
  performance. An approach which is likely to be successful here is to
  wait until we have benchmark library in the LLVM core repository, which
  is something I have suggested in the LLVM mailing lists, received
  positive feedback on and started working on. I will add a dependency as
  soon as the suggested patch is out for a review (currently there's at
  least one complication which is being addressed by
  https://github.com/google/benchmark/pull/649). Key performance
  improvements for iterators are sorting by cost and the limit iterator.
* Quality measurements: currently, the boosting iterator and the two-phase
  lookup stage are not implemented; without these, the quality is likely to
  be worse than what the current implementation can yield. Measuring quality
  is tricky, but another suggestion in the offline discussion was that the
  drop-in replacement should only happen after the Boosting iterators
  implementation (and subsequent query enhancement).

The proposed changes do not affect Clangd functionality or performance,
`DexIndex` is only used in unit tests and not in production code.

Reviewed by: ioeric

Differential Revision: https://reviews.llvm.org/D50337

llvm-svn: 340175
2018-08-20 14:39:32 +00:00
Haojian Wu 931b2262d4 [clangd] Simplify the code using UniqueStringSaver, NFC.
llvm-svn: 340161
2018-08-20 09:47:12 +00:00
Kirill Bobyrev 6d8bd7f56a [clangd] NFC: Cleanup Dex Iterator comments and simplify tests
Proposed changes:

* Cleanup comments in `clangd/index/dex/Iterator.h`: Vim's `gq`
  formatting added redundant spaces instead of newlines in few
  places
* Few comments in `OrIterator` are wrong
* Use `EXPECT_TRUE(Condition)` instead of
  `EXPECT_THAT(Condition, true)` (same with `EXPECT_FALSE`)
* Don't expose `dump()` method to the public by misplacing
  `private:`

This patch does not affect functionality.

Reviewed by: ioeric

Differential Revision: https://reviews.llvm.org/D50956

llvm-svn: 340157
2018-08-20 09:16:14 +00:00
Haojian Wu 02465baea2 [clangd] Add missing lock in the lookup.
Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Differential Revision: https://reviews.llvm.org/D50960

llvm-svn: 340156
2018-08-20 09:07:59 +00:00
Kirill Bobyrev 30ffdf42f7 [clangd] Implement TRUE Iterator
This patch introduces TRUE Iterator which efficiently handles posting
lists containing all items within `[0, Size)` range.
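
To illustrate why this can be efficient, here is a minimal sketch (the class
below is hypothetical and uses a simplified interface): an iterator matching
every ID in `[0, Size)` only needs a counter, not a materialized list.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical sketch: visits every document ID in [0, Size) in order.
class TrueIterator {
public:
  explicit TrueIterator(std::size_t Size) : Size(Size) {}
  bool reachedEnd() const { return Index >= Size; }
  void advance() { ++Index; }
  // Moves to the first ID >= Target. This is O(1) because every ID in the
  // range is present, so no search is needed.
  void advanceTo(std::size_t Target) {
    if (Target > Index)
      Index = std::min(Target, Size);
  }
  std::size_t peek() const { return Index; }

private:
  std::size_t Size;
  std::size_t Index = 0;
};
```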

Reviewed by: ioeric

Differential Revision: https://reviews.llvm.org/D50955

llvm-svn: 340155
2018-08-20 08:47:30 +00:00
Kirill Bobyrev 51534ab864 [clangd] NFC: Improve Dex Iterators debugging traits
This patch improves `dex::Iterator` string representation by
incorporating the information about the element which is currently being
pointed to by the `DocumentIterator`.

Reviewed by: ioeric

Differential Revision: https://reviews.llvm.org/D50689

llvm-svn: 339877
2018-08-16 13:19:43 +00:00
Kirill Bobyrev 8e35f1e7cb NFC: Enforce good formatting across multiple clang-tools-extra files
This patch improves readability of multiple files in clang-tools-extra
and enforces LLVM Coding Guidelines.

Reviewed by: ioeric

Differential Revision: https://reviews.llvm.org/D50707

llvm-svn: 339687
2018-08-14 16:03:32 +00:00
Simon Pilgrim 7b31fae983 Fix MSVC 'std::min: no matching overloaded function found' error.
llvm-svn: 339557
2018-08-13 12:24:48 +00:00
Kirill Bobyrev ff2dd9095f [clangd] Generate incomplete trigrams for the Dex index
This patch handles trigram generation for "short" identifiers and queries.
The trigram generator produces incomplete trigrams for short names so that
the same query iterator API can be used to match symbols whose names don't
have enough characters to form a trigram, and to correctly handle queries
that are themselves too short to yield a full trigram.
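
One possible shape of an incomplete trigram, as a sketch only: this log does
not spell out the exact scheme, so padding with `$` is an assumption (chosen to
match the `$$$` empty trigram mentioned in a later commit), and
`incompleteTrigram` is a hypothetical helper.

```cpp
#include <cctype>
#include <string>

// Builds a 3-character token from a short name: lowercases up to three
// letters/digits and pads the rest with '$' (assumed padding character).
// Intended for names with fewer than three valid characters.
std::string incompleteTrigram(const std::string &Name) {
  std::string Token;
  for (unsigned char C : Name)
    if (std::isalnum(C) && Token.size() < 3)
      Token.push_back(static_cast<char>(std::tolower(C)));
  Token.resize(3, '$');
  return Token;
}
```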

Reviewed by: ioeric

Differential revision: https://reviews.llvm.org/D50517

llvm-svn: 339548
2018-08-13 08:57:06 +00:00
Kirill Bobyrev 0a75766c3d [clangd] Allow consuming limited number of items
This patch modifies `consume` function to allow retrieval of limited
number of symbols. This is the "cheap" implementation of top-level
limiting iterator. In the future we would like to have a complete limit
iterator implementation to insert it into the query subtrees, but in the
meantime this version would be enough for a fully-functional
proof-of-concept Dex implementation.

Reviewers: ioeric, ilya-biryukov

Reviewed by: ioeric

Differential Revision: https://reviews.llvm.org/D50500

llvm-svn: 339426
2018-08-10 11:50:44 +00:00
Haojian Wu c6ddb46162 [clangd] Share getSymbolID implementation.
Summary: And remove all duplicated implementation.

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D50375

llvm-svn: 339116
2018-08-07 08:57:52 +00:00
Haojian Wu 65ac321092 [clangd] Index Interfaces for Xrefs
Summary:
This is the first step of implementing Xrefs in clangd:
  - add index interfaces, and related data structures.

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D49658

llvm-svn: 339011
2018-08-06 13:14:32 +00:00
Haojian Wu 1ffd6b222a [clangd] Make SymbolLocation => bool conversion explicit.
Summary:
The implicit bool conversion can kick in unexpectedly: e.g. when
checking `if (Loc1 == Loc2)`, the compiler converts each SymbolLocation to
bool before comparing (because we don't define `operator==` for SymbolLocation).
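
A small sketch of the pitfall and the fix, using a hypothetical `Location`
type rather than the real SymbolLocation:

```cpp
// Hypothetical location type; Line == 0 means "no location".
struct Location {
  unsigned Line = 0;
  // `explicit` still permits `if (Loc)` and `!Loc`, but it stops the
  // compiler from silently converting both operands of `==` to bool.
  explicit operator bool() const { return Line != 0; }
};

// With the conversion explicit, equality has to be defined on purpose.
// If the conversion were implicit and this operator missing, two different
// non-empty locations would both become `true` and compare equal.
bool operator==(const Location &L, const Location &R) {
  return L.Line == R.Line;
}
```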

Reviewers: sammccall

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D49657

llvm-svn: 338517
2018-08-01 11:24:50 +00:00
Kirill Bobyrev a522c1cf86 [clangd] Return Dex Iterators
The original Dex Iterators patch (https://reviews.llvm.org/rL338017)
caused problems for Clang 3.6 and Clang 3.7 due to a compiler bug
that prevented inferring the template parameter (`Size`) in the
create(And|Or)? functions. It was reverted in https://reviews.llvm.org/rL338054.

In this revision the mentioned helper functions were replaced with
variadic templated versions.

Proposed changes were tested on multiple compiler versions, including
Clang 3.6 which originally caused the failure.

llvm-svn: 338116
2018-07-27 09:54:27 +00:00
Kirill Bobyrev d75b556c56 Revert Clangd Dex Iterators patch
This reverts two revisions:

* https://reviews.llvm.org/rL338017
* https://reviews.llvm.org/rL338028

They caused crashes on the Clang 3.6 & Clang 3.7 buildbots, as
reported by Jeremy Morse.

llvm-svn: 338054
2018-07-26 18:25:48 +00:00
Ilya Biryukov 74f2655dc7 [clangd] Fix (most) naming warnings from clang-tidy. NFC
llvm-svn: 338021
2018-07-26 12:05:31 +00:00
Kirill Bobyrev bea258d3d7 [clangd] Proof-of-concept query iterators for Dex symbol index
This patch introduces three essential types of query iterators:
`DocumentIterator`, `AndIterator`, `OrIterator`. It provides a
convenient API for query tree generation and serves as a building block
for the next generation symbol index - Dex. Currently, many
optimizations are missed to improve code readability and to serve as the
reference implementation. Potential improvements are briefly mentioned
in `FIXME`s and will be addressed in the following patches.
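
As a rough picture of what a query tree computes (a simplified sketch that
evaluates eagerly, unlike lazy iterators; `QueryNode` and `evaluate` are
hypothetical names): document nodes carry sorted posting lists, AND nodes
intersect their children, and OR nodes unite them.

```cpp
#include <algorithm>
#include <iterator>
#include <utility>
#include <vector>

using DocIDs = std::vector<unsigned>; // sorted, unique document IDs

// Hypothetical query tree node.
struct QueryNode {
  enum Kind { Document, And, Or } NodeKind;
  DocIDs Posting;                  // used by Document nodes
  std::vector<QueryNode> Children; // used by And/Or nodes
};

// Eagerly evaluates the tree to the list of matching document IDs.
DocIDs evaluate(const QueryNode &Node) {
  if (Node.NodeKind == QueryNode::Document)
    return Node.Posting;
  DocIDs Result;
  bool First = true;
  for (const QueryNode &Child : Node.Children) {
    DocIDs Next = evaluate(Child);
    if (First) {
      Result = std::move(Next);
      First = false;
      continue;
    }
    DocIDs Merged;
    if (Node.NodeKind == QueryNode::And)
      std::set_intersection(Result.begin(), Result.end(), Next.begin(),
                            Next.end(), std::back_inserter(Merged));
    else
      std::set_union(Result.begin(), Result.end(), Next.begin(), Next.end(),
                     std::back_inserter(Merged));
    Result = std::move(Merged);
  }
  return Result;
}
```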

Dex RFC in the mailing list:
http://lists.llvm.org/pipermail/clangd-dev/2018-July/000022.html

Iterators, their applications and potential extensions are explained in
detail in the design proposal:
https://docs.google.com/document/d/1C-A6PGT6TynyaX4PXyExNMiGmJ2jL1UwV91Kyx11gOI/edit#heading=h.903u1zon9nkj

Reviewers: ioeric, sammccall, ilya-biryukov

Subscribers: cfe-commits, klimek, jfb, mgrang, mgorny, MaskRay, jkorous,
arphaman

Differential Revision: https://reviews.llvm.org/D49546

llvm-svn: 338017
2018-07-26 10:42:31 +00:00
Kirill Bobyrev 5e82f05e7a [clangd] Introduce Dex symbol index search tokens
This patch introduces the core building block of the next-generation
Clangd symbol index - Dex. Search tokens are the keys in the inverted
index and represent a characteristic of a specific symbol: examples of
search token types (Token Namespaces) are

* Trigrams - these are essential for unqualified symbol name fuzzy
  search
* Scopes for filtering the symbols by the namespace
* Paths, e.g. these can be used to uprank symbols defined close to the
  edited file

This patch outlines the generic structure for such token namespaces, but
only implements trigram generation.

The intuition behind the trigram generation algorithm is that each extracted
trigram is a valid sequence for Fuzzy Matcher jumps. The proposed
implementation utilizes the existing FuzzyMatcher API for segmentation and
trigram extraction.

However, the trigram generation algorithm for the query string differs
from the one for identifiers: it simply yields sequences of 3 consecutive
lowercased valid characters (letters, digits).
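
A minimal sketch of the query-side rule as described here, assuming that
invalid characters are simply dropped before the sliding window is applied
(`queryTrigrams` is a hypothetical name, not the function from this patch):

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Returns every run of 3 consecutive lowercased letters/digits of Query.
// Queries with fewer than 3 valid characters produce no trigrams.
std::set<std::string> queryTrigrams(const std::string &Query) {
  std::string Valid;
  for (unsigned char C : Query)
    if (std::isalnum(C))
      Valid.push_back(static_cast<char>(std::tolower(C)));
  std::set<std::string> Result;
  for (std::size_t I = 0; I + 3 <= Valid.size(); ++I)
    Result.insert(Valid.substr(I, 3));
  return Result;
}
```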

Dex RFC in the mailing list:
http://lists.llvm.org/pipermail/clangd-dev/2018-July/000022.html

The trigram generation techniques are described in detail in the
proposal:
https://docs.google.com/document/d/1C-A6PGT6TynyaX4PXyExNMiGmJ2jL1UwV91Kyx11gOI/edit#heading=h.903u1zon9nkj

Reviewers: sammccall, ioeric, ilya-biryukov

Subscribers: cfe-commits, klimek, mgorny, MaskRay, jkorous, arphaman

Differential Revision: https://reviews.llvm.org/D49591

llvm-svn: 337901
2018-07-25 10:34:57 +00:00
Sam McCall bed5885d9e [clangd] Upgrade logging facilities with levels and formatv.
Summary:
log() is split into four functions:
 - elog()/log()/vlog() have different severity levels, allowing filtering
 - dlog() is a lazy macro which uses LLVM_DEBUG - it logs to the logger, but
   conditionally based on -debug-only flag and is omitted in release builds

All logging functions use formatv-style format strings now, e.g:
  log("Could not resolve URI {0}: {1}", URI, Result.takeError());

Existing log sites have been split between elog/log/vlog by best guess.

This includes a workaround for passing Error to formatv that can be
simplified when D49170 or similar lands.

Subscribers: ilya-biryukov, javed.absar, ioeric, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D49008

llvm-svn: 336785
2018-07-11 10:35:11 +00:00
Eric Liu a62c9d62a3 [clangd] Make sure macro information exists before increasing usage count.
llvm-svn: 336581
2018-07-09 18:54:51 +00:00
Eric Liu 48db19e95a [clangd] Support indexing MACROs.
Summary: This is not enabled in the global-symbol-builder or dynamic index yet.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D49028

llvm-svn: 336553
2018-07-09 15:31:07 +00:00
Sam McCall 4e5742a479 [clangd] Make SymbolOrigin an enum class, rather than a plain enum.
I never intended to define namespace pollution like clangd::AST, clangd::Unknown
etc. Oops!

llvm-svn: 336431
2018-07-06 11:50:49 +00:00
Sam McCall 2161ec7ee2 [clangd] Track origins of symbols (various indexes, Sema).
Summary: Surface it in the completion items C++ API, and when a flag is set.

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D48938

llvm-svn: 336309
2018-07-05 06:20:41 +00:00
Eric Liu a095770a82 [clangd] Always remove dots before converting paths to URIs in symbol collector.
llvm-svn: 335458
2018-06-25 11:50:11 +00:00
Sam McCall a68951e37e [clangd] More precise representation of symbol names/labels in the index.
Summary:
Previously, the strings matched LSP completion pretty closely.
The completion label was a single string, for instance. This made
implementing completion itself easy but makes it hard to use the names
in other ways, e.g. as a pretty-printed name in synthesized
documentation/hover.

It also limits our introspection into completion items, which can only
be as precise as the indexed symbols. This change is a prerequisite to
improvements to overload bundling which need to inspect e.g. signature
structure.

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D48475

llvm-svn: 335360
2018-06-22 16:11:35 +00:00
Eric Liu 7ad1696900 [clangd] Expose qualified symbol names in CompletionItem (C++ structure only, no json).
Summary:
The qualified name can be used to match a completion item to its corresponding
symbol. This can be useful for tools that measure code completion quality.
Qualified names are not precise for identifying symbols; we need to figure out a
better way to identify completion items.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D48425

llvm-svn: 335334
2018-06-22 10:46:59 +00:00
Sam McCall 032db94ac9 [clangd] Remove FilterText from the index.
Summary:
It's almost always identical to Name, and in fact we never used it (we used name
instead).
The only case where they differ is objc method selectors (foo: vs foo:bar:).
We can live with the latter for both name and filterText, so I've made that
change too.

Reviewers: ioeric

Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D48375

llvm-svn: 335321
2018-06-22 06:41:43 +00:00
Eric Liu 8763e48727 [clangd] Expose 'shouldCollectSymbol' helper from SymbolCollector.
Summary: This allows tools to examine symbols that would be collected in a symbol index. For example, a tool that measures index-based completion quality would be interested in references to symbols that are collected in the index.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D48418

llvm-svn: 335218
2018-06-21 12:12:26 +00:00
Eric Liu 13e503f68a [clangd] Customizable URI schemes for dynamic index.
Summary:
This allows dynamic index to have consistent URI schemes with the
static index which can have customized URI schemes, which would make file
proximity scoring based on URIs easier.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D47931

llvm-svn: 334809
2018-06-15 08:55:00 +00:00
Eric Liu 6de95ece44 [clangd] Support proximity paths in index fuzzy find.
Reviewers: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D47937

llvm-svn: 334485
2018-06-12 08:48:20 +00:00