Summary:
Currently SHA1 is about 10% of our CPU, this patch reduces it to ~1%.
xxhash is a well-defined (stable) non-cryptographic hash optimized for
fast checksums (like crc32).
Collisions shouldn't be a problem, despite the reduced length:
- for actual file content (used to invalidate bg index shards), there
are only two versions that can collide (new shard and old shard).
- for file paths in bg index shard filenames, we would need 2^32 files
with the same filename to expect a collision. Imperfect hashing may
reduce this a bit but it's well beyond what's plausible.
This will invalidate shards on disk (as usual; I bumped the version),
but this time the filenames are changing so the old files will stick
around :-( So this is more expensive than the usual bump, but would be
good to land before the v9 branch when everyone will start using bg index.
Reviewers: kadircet
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64306
llvm-svn: 365311
Summary:
Previously, when we rename a macro, we get an error message of "there is
no symbol found".
This patch improves the message of this case (as we don't support macros).
Reviewers: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D63922
llvm-svn: 364735
Summary:
Also lift it to SourceCode.h, so that it can be used in other places
(semantic code highlighting).
Reviewers: kadircet
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D63714
llvm-svn: 364280
Summary:
The hope is this will catch a few patterns with repetition:
SomeClass* S = ^SomeClass::Create()
int getFrobnicator() { return ^frobnicator_; }
// discard the factory, it's no longer valid.
^MyFactory.reset();
Without triggering antipatterns too often:
return Point(x.first, x.^second);
I'm going to gather some data on whether this turns out to be a win overall.
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, jfb, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61537
llvm-svn: 360030
Summary: We scrape the enclosing scopes from the source file, and use them in the query.
Reviewers: kadircet
Subscribers: ilya-biryukov, MaskRay, jkorous, mgrang, arphaman, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D61077
llvm-svn: 359284
Summary:
o Lex the code to get the identifiers and put them into a "symbol" index.
o Adds a new completion mode without compilation/sema into code completion workflow.
o Make IncludeInserter work even when no compile command is present, by avoiding
inserting non-verbatim headers.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D60126
llvm-svn: 358159
Summary:
(Changes to UTF-8/UTF-16 here are NFC, moving things around to make the
cases more symmetrical)
Reviewers: ilya-biryukov
Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D59927
llvm-svn: 357173
Summary:
Still some pieces to go here: unit tests for new SourceCode functionality and
a command-line flag to force utf-8 mode. But wanted to get early feedback.
Reviewers: hokein
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58275
llvm-svn: 357102
FileManager constructs a VFS in its constructor if it isn't passed one,
and there's no way to reset it. Make that contract clear by returning a
reference from its accessor.
https://reviews.llvm.org/D59388
llvm-svn: 357038
Summary:
The code tweaks are an implementation of mini-refactorings exposed
via the LSP code actions. They run in two stages:
- Stage 1. Decides whether the action is available to the user and
collects all the information required to finish the action.
Should be cheap, since this will run over all the actions known to
clangd on each textDocument/codeAction request from the client.
- Stage 2. Uses information from stage 1 to produce the actual edits
that the code action should perform. This stage can be expensive and
will only run if the user chooses to perform the specified action in
the UI.
One unfortunate consequence of this change is increased latency of
processing the textDocument/codeAction requests, which now wait for an
AST. However, we cannot avoid this with what we have available in the LSP
today.
Reviewers: kadircet, ioeric, hokein, sammccall
Reviewed By: sammccall
Subscribers: mgrang, mgorny, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D56267
llvm-svn: 352494
Summary:
This enables clangd to intercept compiler diagnostics and attach fixes (e.g. by
querying index). This patch adds missing includes for incomplete types e.g.
member access into class with only forward declaration. This would allow adding
missing includes for user-typed symbol names that are missing declarations
(e.g. typos) in the future.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: mgorny, ilya-biryukov, javed.absar, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D56903
llvm-svn: 352361
to reflect the new license.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351636
The new guideline is to qualify with 'llvm::' explicitly both in
'.h' and '.cpp' files. This simplifies moving the code between
header and source files and is easier to keep consistent.
llvm-svn: 350531
Summary:
This only changes behavior in cases when the file itself is a symlink.
When canonicalizing paths do not look at tryGetRealPathName, which
contains the resolved path for files that are symlinks. Instead first build the
absolute path even if it contains some symlinks on the path. Then resolve only
the symlinks on the path and leave it as it is if the file itself is a symlink.
Reviewers: ilya-biryukov
Reviewed By: ilya-biryukov
Subscribers: ioeric, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D56263
llvm-svn: 350306
Summary:
There were a few different places where we canonicalized paths, each
one had its own flavor. This patch tries to unify them all under one place.
Reviewers: ilya-biryukov
Subscribers: ioeric, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D55818
llvm-svn: 349618
Summary:
Currently, there's no way of knowing about header files
using compilation database, since it doesn't contain header files as entries.
Using this information, restoring from cache using compile commands becomes
possible instead of doing directory traversal. Also, we can issue indexing
actions for out-of-date headers even if source files depending on them haven't
changed.
Reviewers: sammccall
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D54817
llvm-svn: 347669
Summary:
See http://lists.llvm.org/pipermail/clangd-dev/2018-October/000171.html for context.
I kept the error (instead of downgrading to a log message) since the range lengths differing does indicate either a bug in the client or server range calculation or the buffers being out of sync (which both seems serious enough to me to be an error). If any existing clients aside from VSCode break they should only break when accidentally typing a Unicode character which should only be a minor nuisance for a little while until the bug is fixed in the respective client.
Patch by Daan De Meyer!
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: ilya-biryukov, ioeric, jkorous, arphaman, kadircet, cfe-commits
Tags: #clang-tools-extra
Differential Revision: https://reviews.llvm.org/D53527
llvm-svn: 345020
Standardize on the most common namespace setup in our *.cpp files:
using namespace llvm;
namespace clang {
namespace clangd {
void foo(StringRef) { ... }
And remove redundant llvm:: qualifiers. (Except for cases like
make_unique where this causes problems with std:: and ADL).
This choice is pretty arbitrary, but some broad consistency is nice.
This is going to conflict with everything. Sorry :-/
Squash the other configurations:
A)
using namespace llvm;
using namespace clang;
using namespace clangd;
void clangd::foo(StringRef);
This is in some of the older files. (It prevents accidentally defining a
new function instead of one in the header file, for what that's worth).
B)
namespace clang {
namespace clangd {
void foo(llvm::StringRef) { ... }
This is fine, but in practice the using directive often gets added over time.
C)
namespace clang {
namespace clangd {
using namespace llvm; // inside the namespace
This was pretty common, but is a bit misleading: name lookup preferrs
clang::clangd::foo > clang::foo > llvm:: foo (no matter where the using
directive is).
llvm-svn: 344850
Summary:
When compile_commands.json contains some source files expressed as
relative paths, we can get duplicate responses to findDefinitions. The
responses only differ by the URI, which are different versions of the
same file:
"result": [
{
...
"uri": "file:///home/emaisin/src/ls-interact/cpp-test/build/../src/first.h"
},
{
...
"uri": "file:///home/emaisin/src/ls-interact/cpp-test/src/first.h"
}
]
In getAbsoluteFilePath, we try to obtain the realpath of the FileEntry
by calling tryGetRealPathName. However, this can fail and return an
empty string. It may be bug a bug in clang, but in any case we should
fall back to computing it ourselves if it happens.
I changed getAbsoluteFilePath so that if tryGetRealPathName succeeds, we
return right away (a real path is always absolute). Otherwise, we try
to build an absolute path, as we did before, but we also call
VFS->getRealPath to make sure to get the canonical path (e.g. without
any ".." in it).
Reviewers: malaperle
Subscribers: hokein, ilya-biryukov, ioeric, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D48687
llvm-svn: 339483
Summary:
log() is split into four functions:
- elog()/log()/vlog() have different severity levels, allowing filtering
- dlog() is a lazy macro which uses LLVM_DEBUG - it logs to the logger, but
conditionally based on -debug-only flag and is omitted in release builds
All logging functions use formatv-style format strings now, e.g:
log("Could not resolve URI {0}: {1}", URI, Result.takeError());
Existing log sites have been split between elog/log/vlog by best guess.
This includes a workaround for passing Error to formatv that can be
simplified when D49170 or similar lands.
Subscribers: ilya-biryukov, javed.absar, ioeric, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D49008
llvm-svn: 336785
Summary:
An AST-based approach is used to retrieve the document symbols rather than an
in-memory index query. The index is not an ideal fit to achieve this because of
the file-centric query being done here whereas the index is suited for
project-wide queries. Document symbols also includes more symbols and need to
keep the order as seen in the file.
Signed-off-by: Marc-Andre Laperle <marc-andre.laperle@ericsson.com>
Subscribers: tomgr, ilya-biryukov, ioeric, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D47846
llvm-svn: 336386
Summary:
The Language Server Protocol unfortunately mandates that locations in files
be represented by line/column pairs, where the "column" is actually an index
into the UTF-16-encoded text of the line.
(This is because VSCode is written in JavaScript, which is UTF-16-native).
Internally clangd treats source files at UTF-8, the One True Encoding, and
generally deals with byte offsets (though there are exceptions).
Before this patch, conversions between offsets and LSP Position pretended
that Position.character was UTF-8 bytes, which is only true for ASCII lines.
Now we examine the text to convert correctly (but don't actually need to
transcode it, due to some nice details of the encodings).
The updated functions in SourceCode are the blessed way to interact with
the Position.character field, and anything else is likely to be wrong.
So I also updated the other accesses:
- CodeComplete needs a "clang-style" line/column, with column in utf-8 bytes.
This is now converted via Position -> offset -> clang line/column
(a new function is added to SourceCode.h for the second conversion).
- getBeginningOfIdentifier skipped backwards in UTF-16 space, which is will
behave badly when it splits a surrogate pair. Skipping backwards in UTF-8
coordinates gives the lexer a fighting chance of getting this right.
While here, I clarified(?) the logic comments, fixed a bug with identifiers
containing digits, simplified the signature slightly and added a test.
This seems likely to cause problems with editors that have the same bug, and
treat the protocol as if columns are UTF-8 bytes. But we can find and fix those.
Reviewers: hokein
Subscribers: klimek, ilya-biryukov, ioeric, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D46035
llvm-svn: 331029
Summary:
This is a basic implementation of the "workspace/symbol" request which is
used to find symbols by a string query. Since this is similar to code completion
in terms of result, this implementation reuses the "fuzzyFind" in order to get
matches. For now, the scoring algorithm is the same as code completion and
improvements could be done in the future.
The index model doesn't contain quite enough symbols for this to cover
common symbols like methods, enum class enumerators, functions in unamed
namespaces, etc. The index model will be augmented separately to achieve this.
Reviewers: sammccall, ilya-biryukov
Reviewed By: sammccall
Subscribers: jkorous, hokein, simark, sammccall, klimek, mgorny, ilya-biryukov, mgrang, jkorous-apple, ioeric, MaskRay, cfe-commits
Differential Revision: https://reviews.llvm.org/D44882
llvm-svn: 330637
Summary:
To implement incremental document syncing, we want to verify that the
ranges provided by the front-end are valid. Currently, positionToOffset
deals with invalid Positions by returning 0 or Code.size(), which are
two valid offsets. Instead, return an llvm:Expected<size_t> with an
error if the position is invalid.
According to the LSP, if the character value exceeds the number of
characters of the given line, it should default back to the end of the
line. It makes sense in some contexts to have this behavior, and does
not in other contexts. The AllowColumnsBeyondLineLength parameter
allows to decide what to do in that case, default back to the end of the
line, or return an error.
Reviewers: ilya-biryukov
Subscribers: klimek, ilya-biryukov, jkorous-apple, ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D44673
llvm-svn: 328100
Summary:
The new implementation attaches notes to diagnostic message and shows
the original diagnostics in the message of the note.
Reviewers: hokein, ioeric, sammccall
Reviewed By: sammccall
Subscribers: klimek, mgorny, cfe-commits, jkorous-apple
Differential Revision: https://reviews.llvm.org/D44142
llvm-svn: 327282
Summary:
Some of the existing structs had primimtive fields that were
not explicitly initialized on construction.
After this commit every struct consistently sets a defined value for
every field when default-initialized.
Reviewers: hokein, ioeric, sammccall
Reviewed By: sammccall
Subscribers: klimek, cfe-commits, jkorous-apple
Differential Revision: https://reviews.llvm.org/D43230
llvm-svn: 325113
Summary:
- Moved these functions to SourceCode.h
- added unit tests
- fix off by one in positionToOffset: Offset - 1 in final calculation was wrong
- fixed formatOnType which had an equal and opposite off-by-one
- positionToOffset and offsetToPosition both consistently clamp to beginning/end
of file when input is out of range
- gave variables more descriptive names
- removed windows line ending fixmes where there is nothing to fix
- elaborated on UTF-8 fixmes
This will conflict with Eric's D41281, but in a pretty easy-to-resolve way.
Reviewers: ioeric
Subscribers: klimek, mgorny, ilya-biryukov, cfe-commits
Differential Revision: https://reviews.llvm.org/D41351
llvm-svn: 321073