Commit Graph

45 Commits

Author SHA1 Message Date
Eric Liu 8763e48727 [clangd] Expose 'shouldCollectSymbol' helper from SymbolCollector.
Summary: This allows tools to examine symbols that would be collected in a symbol index. For example, a tool that measures index-based completion quality would be interested in references to symbols that are collected in the index.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D48418

llvm-svn: 335218
2018-06-21 12:12:26 +00:00
Eric Liu 13e503f68a [clangd] Customizable URI schemes for dynamic index.
Summary:
This allows dynamic index to have consistent URI schemes with the
static index which can have customized URI schemes, which would make file
proximity scoring based on URIs easier.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D47931

llvm-svn: 334809
2018-06-15 08:55:00 +00:00
Marc-Andre Laperle 945b5a3df0 [clangd] Add "member" symbols to the index
Summary:
This adds more symbols to the index:
- member variables and functions
- enum constants in scoped enums

The code completion behavior should remain intact but workspace symbols should
now provide much more useful symbols.
Other symbols should be considered such as the ones in "main files" (files not
being included) but this can be done separately as this introduces its fair
share of problems.

Signed-off-by: Marc-Andre Laperle <marc-andre.laperle@ericsson.com>

Reviewers: ioeric, sammccall

Reviewed By: ioeric, sammccall

Subscribers: hokein, sammccall, jkorous, klimek, ilya-biryukov, jkorous-apple, ioeric, MaskRay, cfe-commits

Differential Revision: https://reviews.llvm.org/D44954

llvm-svn: 334017
2018-06-05 14:01:40 +00:00
Eric Liu 77d1811e96 [clangd] Avoid indexing decls associated with friend decls.
Summary:
These decls are sometime used as the canonical declarations (e.g. for go-to-def),
which seems to be bad.

- friend decls that are not definitions should be ignored for indexing purposes
- this means they should never be selected as canonical decl
- if the friend decl is the only decl, then the symbol should not be indexed

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: mgorny, klimek, ilya-biryukov, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D47623

llvm-svn: 333885
2018-06-04 11:31:55 +00:00
Eric Liu 3cee95e723 [clangd] Skip .inc headers when canonicalizing header #include.
Summary:
This assumes that .inc files are supposed to be included via headers
that include them.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: klimek, ilya-biryukov, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D47187

llvm-svn: 333188
2018-05-24 14:40:24 +00:00
Eric Liu cf377a1434 [clangd] Correctly handle IWYU prama with verbatim #include header.
llvm-svn: 332959
2018-05-22 08:33:30 +00:00
Ilya Biryukov 43714504a8 [clangd] Retrieve minimally formatted comment text in completion.
Summary:
Previous implementation used to extract brief text from doxygen comments.
Brief text parsing slows down completion and is not suited for
non-doxygen comments.

This commit switches to providing comments that mimic the ones
originally written in the source code, doing minimal reindenting and
removing the comments markers to make the output more user-friendly.

It means we lose support for doxygen-specific features, e.g. extracting
brief text, but provide useful results for non-doxygen comments.
Switching the doxygen support back is an option, but I suggest to see
whether the current approach gives more useful results.

Reviewers: sammccall, hokein, ioeric

Reviewed By: sammccall

Subscribers: klimek, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D45999

llvm-svn: 332459
2018-05-16 12:32:44 +00:00
Eric Liu d67ec24f3e [clangd] Filter out private proto symbols in SymbolCollector.
Summary:
This uses heuristics to identify private proto symbols. For example,
top-level symbols whose name contains "_" are considered private. These symbols
are not expected to be used by users.

Reviewers: ilya-biryukov, malaperle

Reviewed By: ilya-biryukov

Subscribers: sammccall, klimek, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D46751

llvm-svn: 332456
2018-05-16 12:12:30 +00:00
Haojian Wu c53400156b [clangd] Also use UTF-16 in index position.
Reviewers: sammccall

Subscribers: klimek, ilya-biryukov, ioeric, MaskRay, jkorous, cfe-commits

Differential Revision: https://reviews.llvm.org/D46258

llvm-svn: 331168
2018-04-30 11:40:02 +00:00
Haojian Wu 545c02a710 [clangd] Add line and column number to the index symbol.
Summary:
LSP is using Line & column as symbol position, clangd needs to transfer file
offset to Line & column when sending results back to LSP client, which is a high
cost, especially for finding workspace symbol -- we have to read the file
content from disk (if it isn't loaded in memory).

Saving these information in the index will make the clangd life eaiser.

Reviewers: sammccall

Subscribers: klimek, ilya-biryukov, jkorous-apple, ioeric, MaskRay, cfe-commits

Differential Revision: https://reviews.llvm.org/D45513

llvm-svn: 329997
2018-04-13 08:30:39 +00:00
Nico Weber 0da2290958 s/LLVM_ON_WIN32/_WIN32/, clang-tools-extra
LLVM_ON_WIN32 is set exactly with MSVC and MinGW (but not Cygwin) in
HandleLLVMOptions.cmake, which is where _WIN32 defined too.  Just use the
default macro instead of a reinvented one.

See thread "Replacing LLVM_ON_WIN32 with just _WIN32" on llvm-dev and cfe-dev.
No intended behavior change.

llvm-svn: 329695
2018-04-10 13:14:03 +00:00
Sam McCall b9d57118fb [clangd] Adapt index interfaces to D45014, and fix the old bugs.
Summary:
Fix bugs:
- don't count occurrences of decls where we don't spell the name
- findDefinitions at MACRO(^X) goes to the definition of MACRO

Subscribers: klimek, ilya-biryukov, jkorous-apple, ioeric, MaskRay, cfe-commits

Differential Revision: https://reviews.llvm.org/D45356

llvm-svn: 329571
2018-04-09 14:28:52 +00:00
Sam McCall 93f99bf31f [clangd] Collect the number of files referencing a symbol in the static index.
Summary:
This is an important ranking signal.
It's off for the dynamic index for now. Correspondingly, tell the index
infrastructure only to report declarations for the dynamic index.

Reviewers: ioeric, hokein

Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits

Differential Revision: https://reviews.llvm.org/D44315

llvm-svn: 327275
2018-03-12 14:49:09 +00:00
Sam McCall 824913bdb7 [clangd] Don't index template specializations.
Summary:
These have different USRs than the underlying entity, but are not typically
interesting in their own right and can be numerous (e.g. generated traits).

Reviewers: ioeric

Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits

Differential Revision: https://reviews.llvm.org/D44298

llvm-svn: 327127
2018-03-09 13:25:29 +00:00
Eric Liu b96363da76 [clangd] Support include canonicalization in symbol leve.
Summary:
Symbols with different canonical includes might be defined in the same header
(e.g. symbols defined in STL <iosfwd>). This patch adds support for mapping from
qualified symbol names to canonical headers and special mapping for symbols in <iosfwd>

Reviewers: sammccall, hokein

Reviewed By: sammccall

Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits

Differential Revision: https://reviews.llvm.org/D43869

llvm-svn: 326456
2018-03-01 18:06:40 +00:00
Eric Liu cf8601b009 [clangd] Prefer the definition of a TagDecl (e.g. class) as CanonicalDeclaration.
Summary:
Currently, we pick the first declaration of a symbol in a TU, which is considered
canonical in the clangIndex, as the canonical declaration in clangd. This causes
forward declarations that might appear in a random header to be used as a
canonical declaration, which is not desirable for features like go-to-declaration
or include insertion.

For example, for class X, we would consider the forward declaration in fwd.h to
be the canonical declaration, while the preferred canonical declaration should
be the actual definition in x.h.
```
// fwd.h
class X;  // forward decl

// x.h
class X {};
```

This patch fixes the issue by making symbol collector favor the actual definition of
a TagDecl (i.e. class/struct/enum/union) found in a header file over the first seen
declarations in a TU. Other symbol types like functions are not handled because
using the first seen declarations as canonical declarations is usually a good
heuristic for them.

Reviewers: sammccall

Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits

Differential Revision: https://reviews.llvm.org/D43823

llvm-svn: 326313
2018-02-28 09:33:15 +00:00
Roman Lebedev 12b40745ab Revert "[Tooling] [1/1] Refactor FrontendActionFactory::create() to return std::unique_ptr<>"
This reverts commit rL326202

This broke gcc4.8 builds, compiler just segfaults:
http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/14909
http://lab.llvm.org:8011/builders/clang-x86_64-linux-abi-test/builds/22673

llvm-svn: 326203
2018-02-27 15:54:41 +00:00
Roman Lebedev 6b56a11961 [Tooling] [1/1] Refactor FrontendActionFactory::create() to return std::unique_ptr<>
Summary:
I'm not sure whether there are any principal reasons why it returns raw owning pointer,
or it is just a old code that was not updated post-C++11.

I'm not too sure what testing i should do, because `check-all` is not error clean here for some reason,
but it does not //appear// asif those failures are related to these changes.

This is Clang-tools-extra part.
Clang part is D43779.

Reviewers: klimek, bkramer, alexfh, pcc

Reviewed By: alexfh

Subscribers: ioeric, jkorous-apple, cfe-commits

Tags: #clang, #clang-tools-extra

Differential Revision: https://reviews.llvm.org/D43780

llvm-svn: 326202
2018-02-27 15:19:28 +00:00
Eric Liu 02ce01f1e8 [clangd] Not collect include headers for dynamic index for now.
Summary:
The new behaviors introduced by this patch:
o When include collection is enabled, we always set IncludeHeader field in Symbol
even if it's the same as FileURI in decl.
o Disable include collection in FileIndex which is currently only used to build
dynamic index. We should revisit when we actually want to use FileIndex to global
index.
o Code-completion only uses IncludeHeader to insert headers but not FileURI in
CanonicalDeclaration. This ensures that inserted headers are always canonicalized.
Note that include insertion can still be triggered for symbols that are already
included if they are merged from dynamic index and static index, but we would
only use includes that are already canonicalized (e.g. from static index).

Reason for change:
Collecting header includes in dynamic index enables inserting includes for headers
that are not indexed but opened in the editor. Comparing to inserting includes for
symbols in global/static index, this is nice-to-have but would probably require
non-trivial amount of work to get right. For example:
o Currently it's not easy to fully support CanonicalIncludes in dynamic index, given the way
we run dynamic index.
o It's also harder to reason about the correctness of include canonicalization for dynamic index
(i.e. symbols in the current file/TU) than static index where symbols are collected
offline and sanity check is possible before shipping to production.
o We have less control/flexibility over symbol info in the dynamic index
(e.g. URIs, path normalization), which could be used to help make decision when inserting includes.

As header collection (especially canonicalization) is relatively new, and enabling
it for dynamic index would immediately affect current users with only dynamic
index support, I propose we disable it for dynamic index for now to avoid
compromising other hot features like code completion and only support it for
static index where include insertion would likely to bring more value.

Reviewers: ilya-biryukov, sammccall, hokein

Subscribers: klimek, jkorous-apple, cfe-commits

Differential Revision: https://reviews.llvm.org/D43550

llvm-svn: 325764
2018-02-22 10:14:05 +00:00
Eric Liu c5105f9e3c [clangd] collect symbol #include & insert #include in global code completion.
Summary:
o Collect suitable #include paths for index symbols. This also does smart mapping
for STL symbols and IWYU pragma (code borrowed from include-fixer).
o For global code completion, add a command for inserting new #include in each code
completion item.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: klimek, mgorny, ilya-biryukov, jkorous-apple, hintonda, cfe-commits

Differential Revision: https://reviews.llvm.org/D42640

llvm-svn: 325343
2018-02-16 14:15:55 +00:00
Sam McCall c156806844 [clangd] TestFS cleanup: getVirtualBlahBlah -> testPath/testRoot. Remove SmallString micro-opt for more ergonomic tests. NFC
llvm-svn: 325326
2018-02-16 09:41:43 +00:00
Haojian Wu dc02a3d943 [clangd] SymbolLocation only covers symbol name.
Summary:
* Change the offset range to half-open, [start, end).
* Fix a few fixmes.

Reviewers: sammccall

Subscribers: klimek, ilya-biryukov, jkorous-apple, ioeric, cfe-commits

Differential Revision: https://reviews.llvm.org/D43182

llvm-svn: 324992
2018-02-13 09:53:50 +00:00
Sam McCall 6003951c66 [clangd] Collect definitions when indexing.
Within a TU:
 - as now, collect a declaration from the first occurrence of a symbol
   (taking clang's canonical declaration)
 - when we first see a definition occurrence, copy the symbol and add it
Across TUs/sources:
 - mergeSymbol in Merge.h is responsible for combining matching Symbols.
   This covers dynamic/static merges and cross-TU merges in the static index.
 - it prefers declarations from Symbols that have a definition.
 - GlobalSymbolBuilderMain is modified to use mergeSymbol as a reduce step.
Random cleanups (can be pulled out):
 - SymbolFromYAML -> SymbolsFromYAML, new singular SymbolFromYAML added
 - avoid uninit'd SymbolLocations. Add an idiomatic way to check "absent".
 - CanonicalDeclaration (as well as Definition) are mapped as optional in YAML.
 - added operator<< for Symbol & SymbolLocation, for debugging

Reviewers: ioeric, hokein

Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits

Differential Revision: https://reviews.llvm.org/D42942

llvm-svn: 324735
2018-02-09 14:42:01 +00:00
Eric Liu 7f24765912 [clangd] Use URIs in index symbols.
Reviewers: hokein, sammccall

Reviewed By: sammccall

Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits

Differential Revision: https://reviews.llvm.org/D42915

llvm-svn: 324358
2018-02-06 16:10:35 +00:00
Haojian Wu 3b8e00c5e0 [clangd] Fix incorrect file path for symbols defined by the compile command-line option.
Summary:

Reviewers: ioeric

Reviewed By: ioeric

Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits

Differential Revision: https://reviews.llvm.org/D42913

llvm-svn: 324328
2018-02-06 09:50:35 +00:00
Sam McCall c40fdcdd7e [clangd] Fix ExternC test broken by r324081
llvm-svn: 324105
2018-02-02 17:01:36 +00:00
Eric Liu cf1773826f [clangd] Skip inline namespace when collecting scopes for index symbols.
Summary:
Some STL symbols are defined in inline namespaces. For example,
```
namespace std {
inline namespace __cxx11 {
 typedef ... string;
}
}
```
Currently, this will be `std::__cxx11::string`; however, `std::string` is desired.

Inline namespaces are treated as transparent scopes. This
reflects the way they're most commonly used for lookup. Ideally we'd
include them, but at query time it's hard to find all the inline
namespaces to query: the preamble doesn't have a dedicated list.

Reviewers: sammccall, hokein

Reviewed By: sammccall

Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits

Differential Revision: https://reviews.llvm.org/D42796

llvm-svn: 324065
2018-02-02 10:31:42 +00:00
Haojian Wu a8d12bbc3f [clangd] remove the unused code NFC.
llvm-svn: 323960
2018-02-01 13:06:58 +00:00
Haojian Wu b018906051 [clangd] Better handling symbols defined in macros.
Summary:
For symbols defined inside macros:
 * use expansion location, if the symbol is formed via macro concatenation.
 * use spelling location, otherwise.

This will fix some symbols that have ill-format location (especial invalid filepath).

Reviewers: ioeric

Reviewed By: ioeric

Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits

Differential Revision: https://reviews.llvm.org/D42575

llvm-svn: 323867
2018-01-31 12:56:51 +00:00
Eric Liu b57da098e9 [clangd] Fix r323658 test failure on windows.
llvm-svn: 323703
2018-01-29 22:28:08 +00:00
Eric Liu 278e2d1a26 [clangd] Add a fallback directory for collected symbols with relative paths.
Reviewers: hokein, sammccall

Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits

Differential Revision: https://reviews.llvm.org/D42638

llvm-svn: 323658
2018-01-29 15:13:29 +00:00
Sam McCall 8b2faeed02 [clangd] Change index scope convention from "outer::inner" to "outer::inner::"
Global scope is "" (was "")
Top-level namespace scope is "ns::" (was "ns")
Nested namespace scope is "ns::ns::" (was "ns::ns")

This composes more naturally:
- qname = scope + name
- full scope = resolved scope + unresolved scope (D42073 was the trigger)
It removes a wart from the old way: "foo::" has one more separator than "".

Another alternative that has these properties is "::ns", but that lacks
the property that both the scope and the name are substrings of the
qname as produced by clang.

This is re-landing r322996 which didn't build.

llvm-svn: 323000
2018-01-19 22:18:21 +00:00
Sam McCall 50584ea197 Revert "[clangd] Change index scope convention from "outer::inner" to "outer::inner::""
This reverts commit r322996.

llvm-svn: 322998
2018-01-19 22:09:34 +00:00
Sam McCall 1a8c55ecc6 [clangd] Change index scope convention from "outer::inner" to "outer::inner::"
Global scope is "" (was "")
Top-level namespace scope is "ns::" (was "ns")
Nested namespace scope is "ns::ns::" (was "ns::ns")

This composes more naturally:
 - qname = scope + name
 - full scope = resolved scope + unresolved scope (D42073 was the trigger)
It removes a wart from the old way: "foo::" has one more separator than "".

Another alternative that has these properties is "::ns", but that lacks
the property that both the scope and the name are substrings of the
qname as produced by clang.

llvm-svn: 322996
2018-01-19 21:58:58 +00:00
Haojian Wu 9873fdd4f5 [clangd] Collect enum constants in SymbolCollector
Summary:
* ignore nameless symbols
* include enum constant declarataion

Reviewers: ilya-biryukov, jkorous-apple

Reviewed By: ilya-biryukov

Subscribers: ioeric, cfe-commits, klimek

Differential Revision: https://reviews.llvm.org/D42074

llvm-svn: 322929
2018-01-19 09:35:55 +00:00
Eric Liu 9af958fbf8 [clangd] Add more filters for collected symbols.
Summary:
o We only collect symbols in namespace or translation unit scopes.
o Add an option to only collect symbols in included headers.

Reviewers: hokein, ilya-biryukov

Reviewed By: hokein, ilya-biryukov

Subscribers: klimek, cfe-commits

Differential Revision: https://reviews.llvm.org/D41823

llvm-svn: 322193
2018-01-10 14:57:58 +00:00
Eric Liu 76f6b44443 [clangd] Add more symbol information for code completion.
Reviewers: hokein, sammccall

Reviewed By: sammccall

Subscribers: klimek, ilya-biryukov, cfe-commits

Differential Revision: https://reviews.llvm.org/D41345

llvm-svn: 322097
2018-01-09 17:32:00 +00:00
Eric Liu 5732c0e6ea [clangd] Use ToolExecutor to write the global-symbol-builder tool.
Summary:
This enables more execution modes like standalone and Mapreduce-style execution.

See also D41729

Reviewers: hokein, sammccall

Subscribers: klimek, ilya-biryukov, cfe-commits

Differential Revision: https://reviews.llvm.org/D41730

llvm-svn: 322084
2018-01-09 15:21:45 +00:00
Haojian Wu fe22b7474b [clangd] Catch more symbols in SymbolCollector.
Summary:
We currently only collect external-linkage symbols in the collector,
which results in missing some typical symbols (like no-linkage type alias symbols).

This patch relaxes the constraint a bit to allow collecting more symbols.

Reviewers: ilya-biryukov

Reviewed By: ilya-biryukov

Subscribers: klimek, cfe-commits

Differential Revision: https://reviews.llvm.org/D41759

llvm-svn: 322067
2018-01-09 10:44:09 +00:00
Sam McCall 4b9bbb378b [clangd] Use Builder for symbol slabs, and use sorted-vector for storage
Summary:
This improves a few things:
 - the insert -> freeze -> read sequence is now enforced/communicated by the
   type system
 - SymbolSlab::const_iterator iterates over symbols, not over id-symbol pairs
 - we avoid permanently storing a second copy of the IDs, and the
   string map's hashtable

The slab size is now down to 21.8MB for the LLVM project.
Of this only 2.7MB is strings, the rest is #symbols * `sizeof(Symbol)`.
`sizeof(Symbol)` is currently 96, which seems too big - I think
SymbolInfo isn't efficiently packed. That's a topic for another patch!

Also added simple API to see the memory usage/#symbols of a slab, since
it seems likely we will continue to care about this.

Reviewers: ilya-biryukov

Subscribers: klimek, mgrang, cfe-commits

Differential Revision: https://reviews.llvm.org/D41506

llvm-svn: 321412
2017-12-23 19:38:03 +00:00
Sam McCall 6c0d0f5775 [clangd] Index symbols share storage within a slab.
Summary:
Symbols are not self-contained - it's only safe to hand them out if you
guarantee the lifetime of the underlying data.

Before this lands, I'm going to measure the before/after memory usage of the
LLVM index loaded into memory in a single slab.

Reviewers: hokein

Subscribers: klimek, ilya-biryukov, cfe-commits

Differential Revision: https://reviews.llvm.org/D41483

llvm-svn: 321272
2017-12-21 14:58:44 +00:00
Eric Liu 4feda80a43 [clangd] Support filtering by fixing scopes in fuzzyFind.
Summary: When scopes are specified, only match symbols from scopes.

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: klimek, ilya-biryukov, cfe-commits

Differential Revision: https://reviews.llvm.org/D41367

llvm-svn: 321067
2017-12-19 11:37:40 +00:00
Haojian Wu 56a5fca473 [clangd] Construct SymbolSlab from YAML format.
Summary: This will be used together with D40548 for the global index source (experimental).

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: klimek, mgorny, ilya-biryukov, cfe-commits, ioeric

Differential Revision: https://reviews.llvm.org/D41178

llvm-svn: 320694
2017-12-14 12:17:14 +00:00
Ilya Biryukov 5a85b8e6dd [clangd] clang-format the source code. NFC
llvm-svn: 320577
2017-12-13 12:53:16 +00:00
Haojian Wu 4c1394d67d [clangd] Introduce a "Symbol" class.
Summary:
* The "Symbol" class represents a C++ symbol in the codebase, containing all the
  information of a C++ symbol needed by clangd. clangd will use it in clangd's
  AST/dynamic index and global/static index (code completion and code
  navigation).
* The SymbolCollector (another IndexAction) will be used to recollect the
  symbols when the source file is changed (for ASTIndex), or to generate
  all C++ symbols for the whole project.

In the long term (when index-while-building is ready), clangd should share a
same "Symbol" structure and IndexAction with index-while-building, but
for now we want to have some stuff working in clangd.

Reviewers: ioeric, sammccall, ilya-biryukov, malaperle

Reviewed By: sammccall

Subscribers: malaperle, klimek, mgorny, cfe-commits

Differential Revision: https://reviews.llvm.org/D40897

llvm-svn: 320486
2017-12-12 15:42:10 +00:00