Previously we did not record local class declarations. Now with features like
findImplementation and typeHierarchy, we have a need to index such local
classes to accurately report subclasses and implementations of methods.
Performance testing results:
- No changes in indexing timing.
- No significant change in memory usage.
- **1%** increase in #relations.
- **0.17%** increase in #refs.
- **0.22%** increase #symbols.
**New index stats**
Time to index: **4:13 min**
memory usage **543MB**
number of symbols: **521.5K**
number of refs: **8679K**
number of relations: **49K**
**Base Index stats**
Time to index: **4:15 min**
memory usage **542MB**
number of symbols: **520K**
number of refs: **8664K**
number of relations: **48.5K**
Fixes: https://github.com/clangd/clangd/issues/644
Differential Revision: https://reviews.llvm.org/D94785
Previously a corrupted index shard could cause us to resize arrays to an
arbitrary int32. This tends to be a huge number, and can render the
system unresponsive.
Instead, cap this at the amount of data that might reasonably be read
(e.g. the #bytes in the file). If the specified length is more than that,
assume the data is corrupt.
Differential Revision: https://reviews.llvm.org/D91258
https://reviews.llvm.org/D89670 changed the Ref structure, we need to
bump the version to invalidate all stored stale data, otherwise we will
get ` Error while reading shard: malformed or truncated refs` when
building the background index.
Differential Revision: https://reviews.llvm.org/D91131
Without this patch `clangd` crashes at try to load compressed string table when `zlib` is not available.
Example:
- Build `clangd` with MinGW (`zlib` found)
- Build index
- Build `clangd` with Visual Studio compiler (`zlib` not found)
- Try to load index
Reviewed By: sammccall, adamcz
Differential Revision: https://reviews.llvm.org/D87673
Summary:
This is considerably terser than the makeStringError and friends, and
avoids verbosity cliffs that discourage adding log information.
It follows the syntax used in log/elog/vlog/dlog that have been successful.
The main caveats are:
- it's strictly out-of-place in logger.h, though kind of fits thematically and
in implementation
- it claims the "error" identifier, which seems a bit too opinionated
to put higher up in llvm
I've updated some users of StringError mostly at random - there are lots
more mechanical changes but I'd like to get this reviewed before making
them all.
Reviewers: kbobyrev, hokein
Subscribers: mgorny, ilya-biryukov, javed.absar, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D83419
Summary:
Though we don't have new changes to the index format, we have changes to
symbol collector, e.g. collect marcos, spelled references. Bump the
version to force background-index to rebuild.
Reviewers: kadircet
Reviewed By: kadircet
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74127
This is how it should've been and brings it more in line with
std::string_view. There should be no functional change here.
This is mostly mechanical from a custom clang-tidy check, with a lot of
manual fixups. It uncovers a lot of minor inefficiencies.
This doesn't actually modify StringRef yet, I'll do that in a follow-up.
This simplifies various workflows, particularly in debugging/development.
e.g. editors will tend to propagate flags, so you can run
`env CLANGD_FLAGS=-input-mirror-file=/tmp/mirror vim foo.cc` rather than
change the configuration in a persistent way.
(This also gives us a generic lever when we don't know how to customize
the flags in some particular LSP client).
While here, add a test for this and other startup logging, and fix a
couple of direct writes to errs() that should have been logs.
Differential Revision: https://reviews.llvm.org/D65153
llvm-svn: 366991
Summary:
Currently SHA1 is about 10% of our CPU, this patch reduces it to ~1%.
xxhash is a well-defined (stable) non-cryptographic hash optimized for
fast checksums (like crc32).
Collisions shouldn't be a problem, despite the reduced length:
- for actual file content (used to invalidate bg index shards), there
are only two versions that can collide (new shard and old shard).
- for file paths in bg index shard filenames, we would need 2^32 files
with the same filename to expect a collision. Imperfect hashing may
reduce this a bit but it's well beyond what's plausible.
This will invalidate shards on disk (as usual; I bumped the version),
but this time the filenames are changing so the old files will stick
around :-( So this is more expensive than the usual bump, but would be
good to land before the v9 branch when everyone will start using bg index.
Reviewers: kadircet
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64306
llvm-svn: 365311
Summary:
This builds on the relations support added in D59407, D62459,
and D62471 to expose relations in SymbolIndex and its
implementations.
Reviewers: kadircet
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D62839
llvm-svn: 363481
Summary:
After rL360344, BackgroundIndex expects symbols with zero refcounts.
Therefore existing index files are no longer valid.
Assertion regarding finding target of a reference was wrong, since
background-index might still be going on or we might've loaded only some part of
the shards and might be missing the declaring shards for the symbol.
Reviewers: ilya-biryukov
Subscribers: MaskRay, jkorous, arphaman, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61734
llvm-svn: 360349
Summary:
Part of re-landing rC356541 with D59599. Changes the way we store
template arguments, previous patch was storing them inside Name field of Symbol.
Which was violating the assumption:
```Symbol::Scope+Symbol::Name == clang::clangd::printQualifiedName```
which was made in multiple places inside codebase. This patch instead moves
those arguments into their own field. Currently the field is meant to be
human-readable, can be made structured if need be.
Reviewers: ioeric, ilya-biryukov, gribozavr
Subscribers: MaskRay, jkorous, arphaman, jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D59640
llvm-svn: 358273
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
The new guideline is to qualify with 'llvm::' explicitly both in
'.h' and '.cpp' files. This simplifies moving the code between
header and source files and is easier to keep consistent.
llvm-svn: 350531
Summary:
Currently, there's no way of knowing about header files
using compilation database, since it doesn't contain header files as entries.
Using this information, restoring from cache using compile commands becomes
possible instead of doing directory traversal. Also, we can issue indexing
actions for out-of-date headers even if source files depending on them haven't
changed.
Reviewers: sammccall
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D54817
llvm-svn: 347669
Summary:
And add a hidden option to control whether the types are collected.
For experiments, will be removed when expected types implementation
is stabilized.
The index size is almost unchanged, e.g. the YAML index for all clangd
sources increased from 53MB to 54MB.
Reviewers: ioeric, sammccall
Reviewed By: sammccall
Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D52274
llvm-svn: 347560
Summary:
Instead of passing around a list of supported URI schemes in clangd, we
expose an interface to convert a path to URI using any compatible scheme
that has been registered. It favors customized schemes and falls
back to "file" when no other scheme works.
Changes in this patch are:
- URI::create(AbsPath, URISchemes) -> URI::create(AbsPath). The new API finds a
compatible scheme from the registry.
- Remove URISchemes option everywhere (ClangdServer, SymbolCollecter, FileIndex etc).
- Unit tests will use "unittest" by default.
- Move "test" scheme from ClangdLSPServer to ClangdMain.cpp, and only
register the test scheme when lit-test or enable-lit-scheme is set.
(The new flag is added to make lit protocol.test work; I wonder if there
is alternative here.)
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D54800
llvm-svn: 347467
Summary:
Puts the digest of the source file that generated the index into
serialized index and stores them back on load, if exists.
Reviewers: sammccall
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D54693
llvm-svn: 347235
Summary:
This is our goal. It has a non-zero rick, but so far we haven't see any
collision (externally and internally).
Reviewers: sammccall
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D54622
llvm-svn: 347044
Summary:
This would save us 8 bytes per ref, and buy us ~40MB in total
for llvm index (from ~300MB to ~260 MB).
The char pointer must be null-terminated, and llvm::StringSaver
guarantees it.
Reviewers: sammccall
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D53427
llvm-svn: 346852
Summary:
The goal is 8 bytes, which has a nonzero risk of collisions with huge indexes.
This patch should shake out any issues with truncation at all, we can lower
further later.
Reviewers: ioeric
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D53587
llvm-svn: 345113
Standardize on the most common namespace setup in our *.cpp files:
using namespace llvm;
namespace clang {
namespace clangd {
void foo(StringRef) { ... }
And remove redundant llvm:: qualifiers. (Except for cases like
make_unique where this causes problems with std:: and ADL).
This choice is pretty arbitrary, but some broad consistency is nice.
This is going to conflict with everything. Sorry :-/
Squash the other configurations:
A)
using namespace llvm;
using namespace clang;
using namespace clangd;
void clangd::foo(StringRef);
This is in some of the older files. (It prevents accidentally defining a
new function instead of one in the header file, for what that's worth).
B)
namespace clang {
namespace clangd {
void foo(llvm::StringRef) { ... }
This is fine, but in practice the using directive often gets added over time.
C)
namespace clang {
namespace clangd {
using namespace llvm; // inside the namespace
This was pretty common, but is a bit misleading: name lookup preferrs
clang::clangd::foo > clang::foo > llvm:: foo (no matter where the using
directive is).
llvm-svn: 344850
Summary:
The RefSlab::size can easily cause confusions, it returns the number of
different symbols, rahter than the number of all references.
- add numRefs() method and cache it, since calculating it everytime is nontrivial.
- clear misused places.
Reviewers: sammccall
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D53389
llvm-svn: 344745
Summary:
This would buy us more memory. Using a 32-bits integer is enough for
most human-readable source code (up to 4M lines and 4K columns).
Previsouly, we used 8 bytes for a position, now 4 bytes, it would save
us 8 bytes for each Ref and each Symbol instance.
For LLVM-project binary index file, we save ~13% memory.
| Before | After |
| 412MB | 355MB |
Reviewers: sammccall
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D53363
llvm-svn: 344735
* With the current implementation, `sizeof(std::vector<Chunk>)` is added
twice to the `Dex` memory estimate which is incorrect
* `Dex` logs memory usage estimation before `BackingDataSize` is set and
hence the log report excludes size of the external `SymbolSlab` which is
coupled with `Dex` instance
Reviewed By: ioeric
Differential Revision: https://reviews.llvm.org/D52503
llvm-svn: 343117