Commit Graph

34 Commits

Author SHA1 Message Date
Kirill Bobyrev 5e82f05e7a [clangd] Introduce Dex symbol index search tokens
This patch introduces the core building block of the next-generation
Clangd symbol index - Dex. Search tokens are the keys in the inverted
index and represent a characteristic of a specific symbol: examples of
search token types (Token Namespaces) are

* Trigrams -  these are essential for unqualified symbol name fuzzy
search * Scopes for filtering the symbols by the namespace * Paths, e.g.
these can be used to uprank symbols defined close to the edited file

This patch outlines the generic for such token namespaces, but only
implements trigram generation.

The intuition behind trigram generation algorithm is that each extracted
trigram is a valid sequence for Fuzzy Matcher jumps, proposed
implementation utilize existing FuzzyMatcher API for segmentation and
trigram extraction.

However, trigrams generation algorithm for the query string is different
from the previous one: it simply yields sequences of 3 consecutive
lowercased valid characters (letters, digits).

Dex RFC in the mailing list:
http://lists.llvm.org/pipermail/clangd-dev/2018-July/000022.html

The trigram generation techniques are described in detail in the
proposal:
https://docs.google.com/document/d/1C-A6PGT6TynyaX4PXyExNMiGmJ2jL1UwV91Kyx11gOI/edit#heading=h.903u1zon9nkj

Reviewers: sammccall, ioeric, ilya-biryukovA

Subscribers: cfe-commits, klimek, mgorny, MaskRay, jkorous, arphaman

Differential Revision: https://reviews.llvm.org/D49591

llvm-svn: 337901
2018-07-25 10:34:57 +00:00
Sam McCall d20d7989c6 [clangd] Remove JSON library in favor of llvm/Support/JSON
Summary:
The library has graduated from clangd to llvm/Support.
This is a mechanical change to move to the new API and remove the old one.

Main API changes:
 - namespace clang::clangd::json --> llvm::json
 - json::Expr --> json::Value
 - Expr::asString() etc --> Value::getAsString() etc
 - unsigned longs need a cast (due to r336541 adding lossless integer support)

Reviewers: ilya-biryukov

Subscribers: mgorny, ioeric, MaskRay, jkorous, omtcyfz, cfe-commits

Differential Revision: https://reviews.llvm.org/D49077

llvm-svn: 336549
2018-07-09 14:25:59 +00:00
Sam McCall 3f0243fdaf [clangd] Incorporate transitive #includes into code complete proximity scoring.
Summary:
We now compute a distance from the main file to the symbol header, which
is a weighted count of:
 - some number of #include traversals from source file --> included file
 - some number of FS traversals from file --> parent directory
 - some number of FS traversals from parent directory --> child file/dir
This calculation is performed in the appropriate URI scheme.

This means we'll get some proximity boost from header files in main-file
contexts, even when these are in different directory trees.

This extended file proximity model is not yet incorporated in the index
interface/implementation.

Reviewers: ioeric

Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D48441

llvm-svn: 336177
2018-07-03 08:09:29 +00:00
Roman Lebedev 9cb8e39183 [clang][tooling] Don't forget to link to clangToolingInclusions.
Fixes build with shared libs, broken by rL333874.
Some buildbot converage is sorely missing.

llvm-svn: 333891
2018-06-04 12:04:51 +00:00
Heejin Ahn 85e38ee18e [clangd] Fix a link failure in unittests
Summary: D46524 (rL332378) introduced a link failure when built with
`-DSHARED_LIB=ON`, which this patch fixes.

Reviewers: ioeric

Subscribers: klimek, mgorny, ilya-biryukov, ioeric, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D46906

llvm-svn: 332438
2018-05-16 08:53:57 +00:00
Sam McCall c5707b6c36 [clangd] Extract scoring/ranking logic, and shave yaks.
Summary:
Code completion scoring was embedded in CodeComplete.cpp, which is bad:
 - awkward to test. The mechanisms (extracting info from index/sema) can be
   unit-tested well, the policy (scoring) should be quantitatively measured.
   Neither was easily possible, and debugging was hard.
   The intermediate signal struct makes this easier.
 - hard to reuse. This is a bug in workspaceSymbols: it just presents the
   results in the index order, which is not sorted in practice, it needs to rank
   them!
   Also, index implementations care about scoring (both query-dependent and
   independent) in order to truncate result lists appropriately.

The main yak shaved here is the build() function that had 3 variants across
unit tests is unified in TestTU.h (rather than adding a 4th variant).

Reviewers: ilya-biryukov

Subscribers: klimek, mgorny, ioeric, MaskRay, jkorous, mgrang, cfe-commits

Differential Revision: https://reviews.llvm.org/D46524

llvm-svn: 332378
2018-05-15 17:43:27 +00:00
Marc-Andre Laperle b387b6e6dc [clangd] Implementation of workspace/symbol request
Summary:
This is a basic implementation of the "workspace/symbol" request which is
used to find symbols by a string query. Since this is similar to code completion
in terms of result, this implementation reuses the "fuzzyFind" in order to get
matches. For now, the scoring algorithm is the same as code completion and
improvements could be done in the future.

The index model doesn't contain quite enough symbols for this to cover
common symbols like methods, enum class enumerators, functions in unamed
namespaces, etc. The index model will be augmented separately to achieve this.

Reviewers: sammccall, ilya-biryukov

Reviewed By: sammccall

Subscribers: jkorous, hokein, simark, sammccall, klimek, mgorny, ilya-biryukov, mgrang, jkorous-apple, ioeric, MaskRay, cfe-commits

Differential Revision: https://reviews.llvm.org/D44882

llvm-svn: 330637
2018-04-23 20:00:52 +00:00
Sam McCall 690dcf12c2 Parse .h files as objective-c++ if we don't have a compile command.
Summary: This makes C++/objC not totally broken, without hurting C files too much.

Reviewers: ilya-biryukov

Subscribers: klimek, jkorous-apple, ioeric, cfe-commits

Differential Revision: https://reviews.llvm.org/D45442

llvm-svn: 330418
2018-04-20 11:35:17 +00:00
Simon Marchi 9808262ede [clangd] Support incremental document syncing
Summary:
This patch adds support for incremental document syncing, as described
in the LSP spec.  The protocol specifies ranges in terms of Position (a
line and a character), and our drafts are stored as plain strings.  So I
see two things that may not be super efficient for very large files:

- Converting a Position to an offset (the positionToOffset function)
  requires searching for end of lines until we reach the desired line.
- When we update a range, we construct a new string, which implies
  copying the whole document.

However, for the typical size of a C++ document and the frequency of
update (at which a user types), it may not be an issue.  This patch aims
at getting the basic feature in, and we can always improve it later if
we find it's too slow.

Signed-off-by: Simon Marchi <simon.marchi@ericsson.com>

Reviewers: malaperle, ilya-biryukov

Reviewed By: ilya-biryukov

Subscribers: MaskRay, klimek, mgorny, ilya-biryukov, jkorous-apple, ioeric, cfe-commits

Differential Revision: https://reviews.llvm.org/D44272

llvm-svn: 328500
2018-03-26 14:41:40 +00:00
Simon Marchi 766338ad7f Make positionToOffset return llvm::Expected<size_t>
Summary:

To implement incremental document syncing, we want to verify that the
ranges provided by the front-end are valid.  Currently, positionToOffset
deals with invalid Positions by returning 0 or Code.size(), which are
two valid offsets.  Instead, return an llvm:Expected<size_t> with an
error if the position is invalid.

According to the LSP, if the character value exceeds the number of
characters of the given line, it should default back to the end of the
line.  It makes sense in some contexts to have this behavior, and does
not in other contexts.  The AllowColumnsBeyondLineLength parameter
allows to decide what to do in that case, default back to the end of the
line, or return an error.

Reviewers: ilya-biryukov

Subscribers: klimek, ilya-biryukov, jkorous-apple, ioeric, cfe-commits

Differential Revision: https://reviews.llvm.org/D44673

llvm-svn: 328100
2018-03-21 14:36:46 +00:00
Heejin Ahn f4a29252fb [clangd] Fix link failures for Preprocessor::addCommentHandler
Summary:
D42640 adds calls to `Preprocessor::addCommentHandler` in
`unittests/clangd/SymbolCollectorTests.cpp` and
`clangd/global-symbol-builder/GlobalSymbolBuilderMain.cpp` but does not
link `clangLex` library. This causes undefined reference errors when
built with `-DBUILD_SHARED_LIBS=ON`.

Reviewers: ioeric

Subscribers: klimek, mgorny, ilya-biryukov, jkorous-apple, cfe-commits

Differential Revision: https://reviews.llvm.org/D43437

llvm-svn: 325458
2018-02-18 10:50:16 +00:00
Eric Liu c5105f9e3c [clangd] collect symbol #include & insert #include in global code completion.
Summary:
o Collect suitable #include paths for index symbols. This also does smart mapping
for STL symbols and IWYU pragma (code borrowed from include-fixer).
o For global code completion, add a command for inserting new #include in each code
completion item.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: klimek, mgorny, ilya-biryukov, jkorous-apple, hintonda, cfe-commits

Differential Revision: https://reviews.llvm.org/D42640

llvm-svn: 325343
2018-02-16 14:15:55 +00:00
Ilya Biryukov cd5eb00e8b [clangd] Remove codeComplete that returns std::future<>
Summary:
It was deprecated and callback version and is used everywhere.
Only changes to the testing code were needed.

Reviewers: hokein, ioeric, sammccall

Reviewed By: sammccall

Subscribers: mgorny, klimek, jkorous-apple, cfe-commits

Differential Revision: https://reviews.llvm.org/D43068

llvm-svn: 324883
2018-02-12 11:37:28 +00:00
Ilya Biryukov 7e5ee26d1a Resubmit "[clangd] The new threading implementation"
Initially submitted as r324356 and reverted in r324386.

This change additionally contains a fix to crashes of the buildbots.
The source of the crash was undefined behaviour caused by
std::future<> whose std::promise<> was destroyed without calling
set_value().

llvm-svn: 324575
2018-02-08 07:37:35 +00:00
Ilya Biryukov 3693f5941a Revert "[clangd] The new threading implementation" (r324356)
And the follow-up changes r324361 and r324363.
These changes seem to break two buildbots:
  - http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/14091
  - http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules-2/builds/16001

We will need to investigate what went wrong and resubmit the changes
afterwards.

llvm-svn: 324386
2018-02-06 19:22:40 +00:00
Ilya Biryukov cce8883094 [clangd] The new threading implementation
Summary:
In the new threading model clangd creates one thread per file to manage
the AST and one thread to process each of the incoming requests.
The number of actively running threads is bounded by the semaphore to
avoid overloading the system.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: klimek, mgorny, jkorous-apple, ioeric, hintonda, cfe-commits

Differential Revision: https://reviews.llvm.org/D42573

llvm-svn: 324356
2018-02-06 15:53:42 +00:00
Ilya Biryukov 75f1dd9b98 [clangd] Refactored threading in ClangdServer
Summary:
We now provide an abstraction of Scheduler that abstracts threading
and resource management in ClangdServer.
No changes to behavior are intended with an exception of changed error
messages.
This patch is preliminary work to allow a revamped threading
implementation that will move the threading code out of CppFile.

Reviewers: sammccall, bkramer, jkorous-apple

Reviewed By: sammccall

Subscribers: hokein, mgorny, hintonda, ioeric, jkorous-apple, cfe-commits, klimek

Differential Revision: https://reviews.llvm.org/D42174

llvm-svn: 323851
2018-01-31 08:51:16 +00:00
Sam McCall 034e11aca5 [clangd] Add ClangdUnit diagnostics tests using annotated code.
Summary:
This adds checks that our diagnostics emit correct ranges in a bunch of cases,
as promised in D41118.

The diagnostics-preamble test is also converted and extended to be a little more
precise.

diagnostics.test stays around as the smoke test for this feature.

Reviewers: ilya-biryukov

Subscribers: klimek, mgorny, cfe-commits

Differential Revision: https://reviews.llvm.org/D41454

llvm-svn: 323448
2018-01-25 17:29:17 +00:00
Eric Liu f5b8c82198 [clangd] Add support for different file URI schemas.
Summary: I will replace the existing URI struct in Protocol.h with the new URI and rename FileURI to URI in a followup patch.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: jkorous-apple, klimek, mgorny, ilya-biryukov, cfe-commits

Differential Revision: https://reviews.llvm.org/D41946

llvm-svn: 323101
2018-01-22 11:48:20 +00:00
Eric Liu 63696e14e3 [clangd] Pull CodeCompletionString handling logic into its own file and add unit test.
Reviewers: sammccall

Subscribers: klimek, mgorny, ilya-biryukov, cfe-commits

Differential Revision: https://reviews.llvm.org/D41450

llvm-svn: 321193
2017-12-20 17:24:31 +00:00
Sam McCall 328cbdb9e4 [clangd] Switch xrefs and documenthighlight to annotated-code unit tests. NFC
Summary:
The goal here is again to make it easier to read and write the tests.

I've extracted `parseTextMarker` from CodeCompleteTests into an `Annotations`
class, adding features to it:
  - as well as points `^s` it allows ranges `[[...]]`
  - multiple points and ranges are supported
  - points and ranges may be named: `$name^` and `$name[[...]]`

These features are used for the xrefs tests. This also paves the way for
replacing the lit diagnostics.test with more readable unit tests, using named
ranges.

Alternative considered: `TestSelectionRange` in clang-refactor/TestSupport
Main problems were:
 - delimiting the end of ranges is awkward, requiring counting
 - comment syntax is long and at least as cryptic for most cases
 - no separate syntax for point vs range, which keeps xrefs tests concise
 - Still need to convert to Position everywhere
 - Still need helpers for common case of expecting exactly one point/range

(I'll probably promote the extra `PrintTo`s from some of the core Protocol types
into `operator<<` in `Protocol.h` itself in a separate, prior patch...)

Reviewers: ioeric

Subscribers: klimek, mgorny, ilya-biryukov, cfe-commits

Differential Revision: https://reviews.llvm.org/D41432

llvm-svn: 321184
2017-12-20 16:06:05 +00:00
Sam McCall b536a2a5ba [clangd] Expose offset <-> LSP position functions, and fix bugs
Summary:
- Moved these functions to SourceCode.h
- added unit tests
- fix off by one in positionToOffset: Offset - 1 in final calculation was wrong
- fixed formatOnType which had an equal and opposite off-by-one
- positionToOffset and offsetToPosition both consistently clamp to beginning/end
  of file when input is out of range
- gave variables more descriptive names
- removed windows line ending fixmes where there is nothing to fix
- elaborated on UTF-8 fixmes

This will conflict with Eric's D41281, but in a pretty easy-to-resolve way.

Reviewers: ioeric

Subscribers: klimek, mgorny, ilya-biryukov, cfe-commits

Differential Revision: https://reviews.llvm.org/D41351

llvm-svn: 321073
2017-12-19 12:23:48 +00:00
Eric Liu eea1633878 [clangd] Build in-memory index on symbols in files.
Reviewers: sammccall

Reviewed By: sammccall

Subscribers: klimek, mgorny, ilya-biryukov, cfe-commits

Differential Revision: https://reviews.llvm.org/D41276

llvm-svn: 320807
2017-12-15 12:25:02 +00:00
Eric Liu d293bf127a [clangd] Add a FileSymbols container that manages symbols from multiple files.
Reviewers: sammccall

Reviewed By: sammccall

Subscribers: klimek, mgorny, ilya-biryukov, cfe-commits

Differential Revision: https://reviews.llvm.org/D41232

llvm-svn: 320701
2017-12-14 14:50:58 +00:00
Eric Liu 3732cadc73 [clangd] Symbol index interfaces and an in-memory index implementation.
Summary:
o Index interfaces to support using different index sources (e.g. AST index, global index) for code completion, cross-reference finding etc. This patch focuses on code completion.

The following changes in the original patch has been split out.
o Implement an AST-based index.
o Add an option to replace sema code completion for qualified-id with index-based completion.
o Implement an initial naive code completion index which matches symbols that have the query string as substring.

Reviewers: malaperle, sammccall

Reviewed By: sammccall

Subscribers: hokein, klimek, malaperle, mgorny, ilya-biryukov, cfe-commits

Differential Revision: https://reviews.llvm.org/D40548

llvm-svn: 320688
2017-12-14 11:25:49 +00:00
Haojian Wu 03e2bd76c8 [clangd] Fix the unitttest build error on buildbot.
llvm-svn: 320678
2017-12-14 09:20:21 +00:00
Haojian Wu 4c1394d67d [clangd] Introduce a "Symbol" class.
Summary:
* The "Symbol" class represents a C++ symbol in the codebase, containing all the
  information of a C++ symbol needed by clangd. clangd will use it in clangd's
  AST/dynamic index and global/static index (code completion and code
  navigation).
* The SymbolCollector (another IndexAction) will be used to recollect the
  symbols when the source file is changed (for ASTIndex), or to generate
  all C++ symbols for the whole project.

In the long term (when index-while-building is ready), clangd should share a
same "Symbol" structure and IndexAction with index-while-building, but
for now we want to have some stuff working in clangd.

Reviewers: ioeric, sammccall, ilya-biryukov, malaperle

Reviewed By: sammccall

Subscribers: malaperle, klimek, mgorny, cfe-commits

Differential Revision: https://reviews.llvm.org/D40897

llvm-svn: 320486
2017-12-12 15:42:10 +00:00
Ilya Biryukov 657159c273 [clangd] Introduced a Context that stores implicit data
Summary:
It will be used to pass around things like Logger and Tracer throughout
clangd classes.

Reviewers: sammccall, ioeric, hokein, bkramer

Reviewed By: sammccall

Subscribers: klimek, bkramer, mgorny, cfe-commits

Differential Revision: https://reviews.llvm.org/D40485

llvm-svn: 320468
2017-12-12 11:16:45 +00:00
Shoaib Meenai d806af3499 [CMake] Use PRIVATE in target_link_libraries for executables
We currently use target_link_libraries without an explicit scope
specifier (INTERFACE, PRIVATE or PUBLIC) when linking executables.
Dependencies added in this way apply to both the target and its
dependencies, i.e. they become part of the executable's link interface
and are transitive.

Transitive dependencies generally don't make sense for executables,
since you wouldn't normally be linking against an executable. This also
causes issues for generating install export files when using
LLVM_DISTRIBUTION_COMPONENTS. For example, clang has a lot of LLVM
library dependencies, which are currently added as interface
dependencies. If clang is in the distribution components but the LLVM
libraries it depends on aren't (which is a perfectly legitimate use case
if the LLVM libraries are being built static and there are therefore no
run-time dependencies on them), CMake will complain about the LLVM
libraries not being in export set when attempting to generate the
install export file for clang. This is reasonable behavior on CMake's
part, and the right thing is for LLVM's build system to explicitly use
PRIVATE dependencies for executables.

Unfortunately, CMake doesn't allow you to mix and match the keyword and
non-keyword target_link_libraries signatures for a single target; i.e.,
if a single call to target_link_libraries for a particular target uses
one of the INTERFACE, PRIVATE, or PUBLIC keywords, all other calls must
also be updated to use those keywords. This means we must do this change
in a single shot. I also fully expect to have missed some instances; I
tested by enabling all the projects in the monorepo (except dragonegg),
and configuring both with and without shared libraries, on both Darwin
and Linux, but I'm planning to rely on the buildbots for other
configurations (since it should be pretty easy to fix those).

Even after this change, we still have a lot of target_link_libraries
calls that don't specify a scope keyword, mostly for shared libraries.
I'm thinking about addressing those in a follow-up, but that's a
separate change IMO.

Differential Revision: https://reviews.llvm.org/D40823

llvm-svn: 319840
2017-12-05 21:49:56 +00:00
Sam McCall 9aad25f193 [clangd] Split code-completion tests out of ClangdTests. NFC.
Summary:
Common parts are mostly FS related, so pulled out TestFS.h for the common stuff.
Deliberately resisted cleaning up much here, so this is pretty mechanical.

Reviewers: hokein

Subscribers: klimek, mgorny, ilya-biryukov, cfe-commits

Differential Revision: https://reviews.llvm.org/D40784

llvm-svn: 319741
2017-12-05 07:20:26 +00:00
Sam McCall 87496417ff [clangd] Fuzzy match scorer
Summary:
This will be used for rescoring code completion results based on partial
identifiers.
Short-term use:
  - we want to limit the number of code completion results returned to
  improve performance of global completion. The scorer will be used to
  rerank the results to return when the user has applied a filter.
Long-term use case:
  - ranking of completion results from in-memory index
  - merging of completion results from multiple sources (merging usually
  works best when done at the component-score level, rescoring the
  fuzzy-match quality avoids different backends needing to have
  comparable scores)

Reviewers: ilya-biryukov

Subscribers: cfe-commits, mgorny

Differential Revision: https://reviews.llvm.org/D40060

llvm-svn: 319557
2017-12-01 17:08:02 +00:00
Sam McCall dd0566bb2c Adds a json::Expr type to represent intermediate JSON expressions.
Summary:
This form can be created with a nice clang-format-friendly literal syntax,
and gets escaping right. It knows how to call unparse() on our Protocol types.
All the places where we pass around JSON internally now use this type.

Object properties are sorted (stored as std::map) and so serialization is
canonicalized, with optional prettyprinting (triggered by a -pretty flag).
This makes the lit tests much nicer to read and somewhat nicer to debug.
(Unfortunately the completion tests use CHECK-DAG, which only has
line-granularity, so pretty-printing is disabled there. In future we
could make completion ordering deterministic, or switch to unittests).

Compared to the current approach, it has some efficiencies like avoiding copies
of string literals used as object keys, but is probably slower overall.
I think the code/test quality benefits are worth it.

This patch doesn't attempt to do anything about JSON *parsing*.
It takes direction from the proposal in this doc[1], but is limited in scope
and visibility, for now.
I am of half a mind just to use Expr as the target of a parser, and maybe do a
little string deduplication, but not bother with clever memory allocation.
That would be simple, and fast enough for clangd...
[1] https://docs.google.com/document/d/1OEF9IauWwNuSigZzvvbjc1cVS1uGHRyGTXaoy3DjqM4/edit

+cc d0k so he can tell me not to use std::map.

Reviewers: ioeric, malaperle

Subscribers: bkramer, ilya-biryukov, mgorny, klimek

Differential Revision: https://reviews.llvm.org/D39435

llvm-svn: 317486
2017-11-06 15:40:30 +00:00
Sam McCall 8567cb3720 Performance tracing facility for clangd.
Summary:
This lets you visualize clangd's activity on different threads over time,
and understand critical paths of requests and object lifetimes.
The data produced can be visualized in Chrome (at chrome://tracing), or
in a standalone copy of catapult (http://github.com/catapult-project/catapult)

This patch consists of:
 - a command line flag "-trace" that causes clangd to emit JSON trace data
 - an API (in Trace.h) allowing clangd code to easily add events to the stream
 - several initial uses of this API to capture JSON-RPC requests, builds, logs

Example result: https://photos.app.goo.gl/12L9swaz5REGQ1rm1

Caveats:
 - JSON serialization is ad-hoc (isn't it everywhere?) so the API is
   limited to naming events rather than attaching arbitrary metadata.
   I'd like to fix this (I think we could use a JSON-object abstraction).
 - The recording is very naive: events are written immediately by
   locking a mutex. Contention on the mutex might disturb performance.
 - For now it just traces instants or spans on the current thread.
   There are other things that make sense to show (cross-thread flows,
   non-thread resources such as ASTs). But we have to start somewhere.

Reviewers: ioeric, ilya-biryukov

Subscribers: cfe-commits, mgorny

Differential Revision: https://reviews.llvm.org/D39086

llvm-svn: 317193
2017-11-02 09:21:51 +00:00
Ilya Biryukov 0f62ed2bbe [clangd] Allow to use vfs::FileSystem for file accesses.
Summary:
Custom vfs::FileSystem is currently used for unit tests.
This revision depends on https://reviews.llvm.org/D33397.

Reviewers: bkramer, krasimir

Reviewed By: bkramer, krasimir

Subscribers: klimek, cfe-commits, mgorny

Differential Revision: https://reviews.llvm.org/D33416

llvm-svn: 303977
2017-05-26 12:26:51 +00:00