Summary:
Potential use case: argument go-to-definition result with symbol
information (e.g. function definition in cc file) that might not be in the AST.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D44305
llvm-svn: 327487
Summary:
This is an important ranking signal.
It's off for the dynamic index for now. Correspondingly, tell the index
infrastructure only to report declarations for the dynamic index.
Reviewers: ioeric, hokein
Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D44315
llvm-svn: 327275
Summary: This also matches the range in symbol index.
Reviewers: sammccall
Subscribers: klimek, ilya-biryukov, jkorous-apple, ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D44247
llvm-svn: 327129
Summary:
These have different USRs than the underlying entity, but are not typically
interesting in their own right and can be numerous (e.g. generated traits).
Reviewers: ioeric
Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D44298
llvm-svn: 327127
Summary:
Symbols with different canonical includes might be defined in the same header
(e.g. symbols defined in STL <iosfwd>). This patch adds support for mapping from
qualified symbol names to canonical headers and special mapping for symbols in <iosfwd>
Reviewers: sammccall, hokein
Reviewed By: sammccall
Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D43869
llvm-svn: 326456
Summary:
Currently, we pick the first declaration of a symbol in a TU, which is considered
canonical in the clangIndex, as the canonical declaration in clangd. This causes
forward declarations that might appear in a random header to be used as a
canonical declaration, which is not desirable for features like go-to-declaration
or include insertion.
For example, for class X, we would consider the forward declaration in fwd.h to
be the canonical declaration, while the preferred canonical declaration should
be the actual definition in x.h.
```
// fwd.h
class X; // forward decl
// x.h
class X {};
```
This patch fixes the issue by making symbol collector favor the actual definition of
a TagDecl (i.e. class/struct/enum/union) found in a header file over the first seen
declarations in a TU. Other symbol types like functions are not handled because
using the first seen declarations as canonical declarations is usually a good
heuristic for them.
Reviewers: sammccall
Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D43823
llvm-svn: 326313
Summary:
The new behaviors introduced by this patch:
o When include collection is enabled, we always set IncludeHeader field in Symbol
even if it's the same as FileURI in decl.
o Disable include collection in FileIndex which is currently only used to build
dynamic index. We should revisit when we actually want to use FileIndex to global
index.
o Code-completion only uses IncludeHeader to insert headers but not FileURI in
CanonicalDeclaration. This ensures that inserted headers are always canonicalized.
Note that include insertion can still be triggered for symbols that are already
included if they are merged from dynamic index and static index, but we would
only use includes that are already canonicalized (e.g. from static index).
Reason for change:
Collecting header includes in dynamic index enables inserting includes for headers
that are not indexed but opened in the editor. Comparing to inserting includes for
symbols in global/static index, this is nice-to-have but would probably require
non-trivial amount of work to get right. For example:
o Currently it's not easy to fully support CanonicalIncludes in dynamic index, given the way
we run dynamic index.
o It's also harder to reason about the correctness of include canonicalization for dynamic index
(i.e. symbols in the current file/TU) than static index where symbols are collected
offline and sanity check is possible before shipping to production.
o We have less control/flexibility over symbol info in the dynamic index
(e.g. URIs, path normalization), which could be used to help make decision when inserting includes.
As header collection (especially canonicalization) is relatively new, and enabling
it for dynamic index would immediately affect current users with only dynamic
index support, I propose we disable it for dynamic index for now to avoid
compromising other hot features like code completion and only support it for
static index where include insertion would likely to bring more value.
Reviewers: ilya-biryukov, sammccall, hokein
Subscribers: klimek, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D43550
llvm-svn: 325764
Summary:
o Avoid inserting a header include into the header itself.
o Avoid inserting non-header files (by not indexing symbols in main
files at all).
o Canonicalize include paths for symbols in dynamic index.
Reviewers: sammccall, ilya-biryukov
Reviewed By: ilya-biryukov
Subscribers: klimek, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D43462
llvm-svn: 325523
Summary:
There are a few implementation options here - alternatives are either both
awkward and inefficient, or really inefficient.
This is at least potentially a hot path when used as a reducer for common
symbols.
(Also fix an unused-var that sneaked in)
Reviewers: ioeric
Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D43381
llvm-svn: 325476
Summary:
o Collect suitable #include paths for index symbols. This also does smart mapping
for STL symbols and IWYU pragma (code borrowed from include-fixer).
o For global code completion, add a command for inserting new #include in each code
completion item.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: klimek, mgorny, ilya-biryukov, jkorous-apple, hintonda, cfe-commits
Differential Revision: https://reviews.llvm.org/D42640
llvm-svn: 325343
Within a TU:
- as now, collect a declaration from the first occurrence of a symbol
(taking clang's canonical declaration)
- when we first see a definition occurrence, copy the symbol and add it
Across TUs/sources:
- mergeSymbol in Merge.h is responsible for combining matching Symbols.
This covers dynamic/static merges and cross-TU merges in the static index.
- it prefers declarations from Symbols that have a definition.
- GlobalSymbolBuilderMain is modified to use mergeSymbol as a reduce step.
Random cleanups (can be pulled out):
- SymbolFromYAML -> SymbolsFromYAML, new singular SymbolFromYAML added
- avoid uninit'd SymbolLocations. Add an idiomatic way to check "absent".
- CanonicalDeclaration (as well as Definition) are mapped as optional in YAML.
- added operator<< for Symbol & SymbolLocation, for debugging
Reviewers: ioeric, hokein
Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D42942
llvm-svn: 324735
Summary:
Some STL symbols are defined in inline namespaces. For example,
```
namespace std {
inline namespace __cxx11 {
typedef ... string;
}
}
```
Currently, this will be `std::__cxx11::string`; however, `std::string` is desired.
Inline namespaces are treated as transparent scopes. This
reflects the way they're most commonly used for lookup. Ideally we'd
include them, but at query time it's hard to find all the inline
namespaces to query: the preamble doesn't have a dedicated list.
Reviewers: sammccall, hokein
Reviewed By: sammccall
Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D42796
llvm-svn: 324065
Summary:
Instead of passing Context explicitly around, we now have a thread-local
Context object `Context::current()` which is an implicit argument to
every function.
Most manipulation of this should use the WithContextValue helper, which
augments the current Context to add a single KV pair, and restores the
old context on destruction.
Advantages are:
- less boilerplate in functions that just propagate contexts
- reading most code doesn't require understanding context at all, and
using context as values in fewer places still
- fewer options to pass the "wrong" context when it changes within a
scope (e.g. when using Span)
- contexts pass through interfaces we can't modify, such as VFS
- propagating contexts across threads was slightly tricky (e.g.
copy vs move, no move-init in lambdas), and is now encapsulated in
the threadpool
Disadvantages are all the usual TLS stuff - hidden magic, and
potential for higher memory usage on threads that don't use the
context. (In practice, it's just one pointer)
Reviewers: ilya-biryukov
Subscribers: klimek, jkorous-apple, ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D42517
llvm-svn: 323872
Summary:
For symbols defined inside macros:
* use expansion location, if the symbol is formed via macro concatenation.
* use spelling location, otherwise.
This will fix some symbols that have ill-format location (especial invalid filepath).
Reviewers: ioeric
Reviewed By: ioeric
Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits
Differential Revision: https://reviews.llvm.org/D42575
llvm-svn: 323867
Summary:
* truncate symbols from static/dynamic index to the limited number
(which would save lots of cost in constructing the merged symbols).
* add an CLI option allowing to limit the number of returned completion results.
(default to 100)
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: klimek, ilya-biryukov, jkorous-apple, ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D42484
llvm-svn: 323408
Global scope is "" (was "")
Top-level namespace scope is "ns::" (was "ns")
Nested namespace scope is "ns::ns::" (was "ns::ns")
This composes more naturally:
- qname = scope + name
- full scope = resolved scope + unresolved scope (D42073 was the trigger)
It removes a wart from the old way: "foo::" has one more separator than "".
Another alternative that has these properties is "::ns", but that lacks
the property that both the scope and the name are substrings of the
qname as produced by clang.
This is re-landing r322996 which didn't build.
llvm-svn: 323000
Global scope is "" (was "")
Top-level namespace scope is "ns::" (was "ns")
Nested namespace scope is "ns::ns::" (was "ns::ns")
This composes more naturally:
- qname = scope + name
- full scope = resolved scope + unresolved scope (D42073 was the trigger)
It removes a wart from the old way: "foo::" has one more separator than "".
Another alternative that has these properties is "::ns", but that lacks
the property that both the scope and the name are substrings of the
qname as produced by clang.
llvm-svn: 322996
Summary:
This makes performance slower but more predictable (it always processes
every symbol). We need to find ways to make this fast, possibly by precomputing
short queries or capping the number of scored results. But our current approach
is too naive.
It also no longer returns results in a "good" order. In fact it's pathological:
the top N results are ranked from worst to best. Indexes aren't responsible for
ranking and MergedIndex can't do a good job, so I'm pleased that this will make
any hidden assumptions we have more noticeable :-)
Reviewers: hokein
Subscribers: klimek, ilya-biryukov, cfe-commits
Differential Revision: https://reviews.llvm.org/D42060
llvm-svn: 322821
Summary:
This would have caught a bug I wrote in an early version of D42049, where
an index user could overwrite data internal to the index because the Symbol is
not deep-const.
The YAML traits are now a bit more verbose, but separate concerns a bit more
nicely: ArenaPtr can be reused for other similarly-allocated objects, including
scalars etc.
Reviewers: hokein
Subscribers: klimek, ilya-biryukov, cfe-commits, ioeric
Differential Revision: https://reviews.llvm.org/D42059
llvm-svn: 322509
Summary:
We now hide the static/dynamic split from the code completion, behind a
new implementation of the SymbolIndex interface. This will reduce the
complexity of the sema/index merging that needs to be done by
CodeComplete, at a fairly small cost in flexibility.
Reviewers: hokein
Subscribers: klimek, mgorny, ilya-biryukov, cfe-commits
Differential Revision: https://reviews.llvm.org/D42049
llvm-svn: 322480
Summary:
To stay fast, it avoids deserializing anything outside the current file, by
disabling the LoadExternal code completion option added in r322377, when the
index is enabled.
Reviewers: hokein
Subscribers: klimek, ilya-biryukov, cfe-commits
Differential Revision: https://reviews.llvm.org/D41996
llvm-svn: 322387
Summary:
o We only collect symbols in namespace or translation unit scopes.
o Add an option to only collect symbols in included headers.
Reviewers: hokein, ilya-biryukov
Reviewed By: hokein, ilya-biryukov
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D41823
llvm-svn: 322193
Summary:
Use the YAML-format symbols (generated by the global-symbol-builder tool) to
do the global code completion.
It is **experimental** only , but it allows us to experience global code
completion on a relatively small project.
Tested with LLVM project.
Reviewers: sammccall, ioeric
Reviewed By: sammccall, ioeric
Subscribers: klimek, ilya-biryukov, cfe-commits
Differential Revision: https://reviews.llvm.org/D41668
llvm-svn: 322191
Summary:
This enables more execution modes like standalone and Mapreduce-style execution.
See also D41729
Reviewers: hokein, sammccall
Subscribers: klimek, ilya-biryukov, cfe-commits
Differential Revision: https://reviews.llvm.org/D41730
llvm-svn: 322084
Summary:
We currently only collect external-linkage symbols in the collector,
which results in missing some typical symbols (like no-linkage type alias symbols).
This patch relaxes the constraint a bit to allow collecting more symbols.
Reviewers: ilya-biryukov
Reviewed By: ilya-biryukov
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D41759
llvm-svn: 322067
Summary:
This improves a few things:
- the insert -> freeze -> read sequence is now enforced/communicated by the
type system
- SymbolSlab::const_iterator iterates over symbols, not over id-symbol pairs
- we avoid permanently storing a second copy of the IDs, and the
string map's hashtable
The slab size is now down to 21.8MB for the LLVM project.
Of this only 2.7MB is strings, the rest is #symbols * `sizeof(Symbol)`.
`sizeof(Symbol)` is currently 96, which seems too big - I think
SymbolInfo isn't efficiently packed. That's a topic for another patch!
Also added simple API to see the memory usage/#symbols of a slab, since
it seems likely we will continue to care about this.
Reviewers: ilya-biryukov
Subscribers: klimek, mgrang, cfe-commits
Differential Revision: https://reviews.llvm.org/D41506
llvm-svn: 321412
Summary:
Symbols are not self-contained - it's only safe to hand them out if you
guarantee the lifetime of the underlying data.
Before this lands, I'm going to measure the before/after memory usage of the
LLVM index loaded into memory in a single slab.
Reviewers: hokein
Subscribers: klimek, ilya-biryukov, cfe-commits
Differential Revision: https://reviews.llvm.org/D41483
llvm-svn: 321272