llvm-project

Commit Graph

Author	SHA1	Message	Date
Sam McCall	668ac94ba4	[clangd] Truncate SymbolID to 16 bytes. Summary: The goal is 8 bytes, which has a nonzero risk of collisions with huge indexes. This patch should shake out any issues with truncation at all, we can lower further later. Reviewers: ioeric Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D53587 llvm-svn: 345113	2018-10-24 06:58:42 +00:00
Sam McCall	c008af6466	[clangd] Namespace style cleanup in cpp files. NFC. Standardize on the most common namespace setup in our *.cpp files: using namespace llvm; namespace clang { namespace clangd { void foo(StringRef) { ... } And remove redundant llvm:: qualifiers. (Except for cases like make_unique where this causes problems with std:: and ADL). This choice is pretty arbitrary, but some broad consistency is nice. This is going to conflict with everything. Sorry :-/ Squash the other configurations: A) using namespace llvm; using namespace clang; using namespace clangd; void clangd::foo(StringRef); This is in some of the older files. (It prevents accidentally defining a new function instead of one in the header file, for what that's worth). B) namespace clang { namespace clangd { void foo(llvm::StringRef) { ... } This is fine, but in practice the using directive often gets added over time. C) namespace clang { namespace clangd { using namespace llvm; // inside the namespace This was pretty common, but is a bit misleading: name lookup prefers clang::clangd::foo > clang::foo > llvm::foo (no matter where the using directive is). llvm-svn: 344850	2018-10-20 15:30:37 +00:00
Haojian Wu	812b6c51c3	[clangd] Remove the overflow log. Summary: LLVM codebase has generated files (all are build/Target/XXX/*.inc) that exceed the MaxLine & MaxColumn. Printing these logs would be noisy. Reviewers: sammccall Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D53400 llvm-svn: 344777	2018-10-19 08:35:24 +00:00
Haojian Wu	6ece6e7dad	[clangd] Clear the semantic of RefSlab::size. Summary: The RefSlab::size can easily cause confusion: it returns the number of different symbols, rather than the number of all references. - add numRefs() method and cache it, since calculating it every time is nontrivial. - clear misused places. Reviewers: sammccall Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D53389 llvm-svn: 344745	2018-10-18 15:33:20 +00:00
Haojian Wu	b515fabb3b	[clangd] Encode Line/Column as a 32-bits integer. Summary: This saves memory. Using a 32-bit integer is enough for most human-readable source code (up to 4M lines and 4K columns). Previously, we used 8 bytes for a position; now we use 4 bytes, which saves 8 bytes for each Ref and each Symbol instance. For the LLVM-project binary index file, we save ~13% memory (before: 412MB, after: 355MB). Reviewers: sammccall Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D53363 llvm-svn: 344735	2018-10-18 10:43:50 +00:00
Jonas Toth	3acdd020b4	[clangd] fix miscompiling lower_bound call llvm-svn: 344044	2018-10-09 13:24:50 +00:00
Kirill Bobyrev	4a5ff88fdb	[clangd] NFC: Migrate to LLVM STLExtras API where possible This patch improves readability by migrating `std::function(ForwardIt start, ForwardIt end, ...)` to LLVM's STLExtras range-based equivalent `llvm::function(RangeT &&Range, ...)`. Similar change in Clang: D52576. Reviewed By: sammccall Differential Revision: https://reviews.llvm.org/D52650 llvm-svn: 343937	2018-10-07 14:49:41 +00:00
Sam McCall	46b5555844	[clangd] Fix error handling for SymbolID parsing (notably YAML and dexp) llvm-svn: 342505	2018-09-18 19:00:59 +00:00
Kirill Bobyrev	e6dd0806c7	[clangd] Cleanup FuzzyFindRequest filtering limit semantics As discussed during D51860 review, it is better to use `llvm::Optional` here as it has clear semantics which reflect intended behavior. Reviewed By: sammccall Differential Revision: https://reviews.llvm.org/D52028 llvm-svn: 342138	2018-09-13 14:27:03 +00:00
Kirill Bobyrev	0dee397e06	[clangd] NFC: Use uint32_t for FuzzyFindRequest limits Reviewed By: ioeric Differential Revision: https://reviews.llvm.org/D51860 llvm-svn: 341921	2018-09-11 10:31:38 +00:00
Kirill Bobyrev	5faf8a3d84	[clangd] Unbreak buildbots after r341802 Solution: use std::move when returning result from toJSON(...). llvm-svn: 341832	2018-09-10 14:31:38 +00:00
Kirill Bobyrev	09f00dcf69	[clangd] Implement FuzzyFindRequest JSON (de)serialization JSON (de)serialization of `FuzzyFindRequest` might be useful for both D51090 and D51628. Also, this allows precise logging of the fuzzy find requests. Reviewed By: sammccall Differential Revision: https://reviews.llvm.org/D51852 llvm-svn: 341802	2018-09-10 11:51:05 +00:00
Eric Liu	6df66001ee	[clangd] Add "Deprecated" field to Symbol and CodeCompletion. Summary: Also set "deprecated" field in LSP CompletionItem. Reviewers: sammccall, kadircet Reviewed By: sammccall Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits Differential Revision: https://reviews.llvm.org/D51724 llvm-svn: 341576	2018-09-06 18:52:26 +00:00
Sam McCall	50f3631057	[clangd] Define a compact binary serialization format for symbol slab/index. Summary: This is intended to replace the current YAML format for general use. It's ~10x more compact than YAML, and ~40% more compact than gzipped YAML: llvmidx.riff = 20M, llvmidx.yaml = 272M, llvmidx.yaml.gz = 32M It's also simpler/faster to read and write. The format is a RIFF container (chunks of (type, size, data)) with: - a compressed string table - simple binary encoding of symbols (with varints for compactness) It can be extended to include occurrences, Dex posting lists, etc. There's no rich backwards-compatibility scheme, but a version number is included so we can detect incompatible files and do ad-hoc back-compat. Alternatives considered: - compressed YAML or JSON: bulky and slow to load - llvm bitstream: confusing model and libraries are hard to use. My attempt produced slightly larger files, and the code was longer and slower. - protobuf or similar: would be really nice (esp for back-compat) but the dependency is a big hassle - ad-hoc binary format without a container: it seems clear we're going to add posting lists and occurrences here, and that they will benefit from sharing a string table. The container makes it easy to debug these pieces in isolation, and make them optional. Reviewers: ioeric Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D51585 llvm-svn: 341375	2018-09-04 16:16:50 +00:00
Kirill Bobyrev	cc8b507a60	[clangd] NFC: Change quality type to float Reviewed by: sammccall Differential Revision: https://reviews.llvm.org/D51636 llvm-svn: 341374	2018-09-04 15:45:56 +00:00
Sam McCall	b0138317d6	[clangd] SymbolOccurrences -> Refs and cleanup Summary: A few things that I noticed while merging the SwapIndex patch: - SymbolOccurrences and particularly SymbolOccurrenceSlab are unwieldy names, and these names appear a lot. Ref, RefSlab, etc seem clear enough and read/format much better. - The asymmetry between SymbolSlab and RefSlab (build() vs freeze()) is confusing and irritating, and doesn't even save much code. Avoiding RefSlab::Builder was my idea, but it was a bad one; add it. - DenseMap<SymbolID, ArrayRef<Ref>> seems like a reasonable compromise for constructing MemIndex - and means many less wasted allocations than the current DenseMap<SymbolID, vector<Ref*>> for FileIndex, and none for slabs. - RefSlab::find() is not actually used for anything, so we can throw away the DenseMap and keep the representation much more compact. - A few naming/consistency fixes: e.g. Slabs,Refs -> Symbols,Refs. Reviewers: ioeric Subscribers: ilya-biryukov, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D51605 llvm-svn: 341368	2018-09-04 14:39:56 +00:00
Sam McCall	9c7624e14b	[clangd] Factor out the data-swapping functionality from MemIndex/DexIndex. Summary: This is now handled by a wrapper class SwapIndex, so MemIndex/DexIndex can be immutable and focus on their job. Old and busted: I have a MemIndex, which holds a shared_ptr<vector<Symbol>>, which keeps the symbol slab alive. I update by calling build(shared_ptr<vector<Symbol>>). New hotness: I have a SwapIndex, which holds a unique_ptr<SymbolIndex>, which holds a MemIndex, which holds a shared_ptr<void>, which keeps backing data alive. I update by building a new MemIndex and calling SwapIndex::reset(). Reviewers: kbobyrev, ioeric Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D51422 llvm-svn: 341318	2018-09-03 14:37:43 +00:00
Eric Liu	83f63e42b2	[clangd] Support multiple #include headers in one symbol. Summary: Currently, a symbol can have only one #include header attached, which might not work well if the symbol can be imported via different #includes depending on where it's used. This patch stores multiple #include headers (with # references) for each symbol, so that CodeCompletion can decide which include to insert. In this patch, code completion simply picks the most popular include as the default inserted header. We also return all possible includes and their edits in the `CodeCompletion` results. Reviewers: sammccall Reviewed By: sammccall Subscribers: mgrang, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D51291 llvm-svn: 341304	2018-09-03 10:18:21 +00:00
Fangrui Song	399943bc76	[clangd] Fix many typos. NFC llvm-svn: 341273	2018-09-01 07:47:03 +00:00
Sam McCall	2e5700f038	[clangd] Flatten out Symbol::Details. It was ill-conceived, sorry. Reviewers: ioeric Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits Differential Revision: https://reviews.llvm.org/D51504 llvm-svn: 341211	2018-08-31 13:55:01 +00:00
Haojian Wu	d81e3146e3	[clangd] Collect symbol occurrences in SymbolCollector. SymbolCollector will be used for two cases: - collect Symbol type only, used for indexing preamble AST. - collect Symbol and SymbolOccurrences, used for indexing main AST. For finding local references from the AST, we will implement it in other ways. llvm-svn: 341208	2018-08-31 12:54:13 +00:00
Haojian Wu	931b2262d4	[clangd] Simplify the code using UniqueStringSaver, NFC. llvm-svn: 340161	2018-08-20 09:47:12 +00:00
Sam McCall	4e5742a479	[clangd] Make SymbolOrigin an enum class, rather than a plain enum. I never intended to define namespace pollution like clangd::AST, clangd::Unknown etc. Oops! llvm-svn: 336431	2018-07-06 11:50:49 +00:00
Sam McCall	2161ec7ee2	[clangd] Track origins of symbols (various indexes, Sema). Summary: Surface it in the completion items C++ API, and when a flag is set. Reviewers: ioeric Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D48938 llvm-svn: 336309	2018-07-05 06:20:41 +00:00
Sam McCall	a68951e37e	[clangd] More precise representation of symbol names/labels in the index. Summary: Previously, the strings matched LSP completion pretty closely. The completion label was a single string, for instance. This made implementing completion itself easy but makes it hard to use the names in other way, e.g. pretty-printed name in synthesized documentation/hover. It also limits our introspection into completion items, which can only be as precise as the indexed symbols. This change is a prerequisite to improvements to overload bundling which need to inspect e.g. signature structure. Reviewers: ioeric Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D48475 llvm-svn: 335360	2018-06-22 16:11:35 +00:00
Sam McCall	032db94ac9	[clangd] Remove FilterText from the index. Summary: It's almost always identical to Name, and in fact we never used it (we used name instead). The only case where they differ is objc method selectors (foo: vs foo:bar:). We can live with the latter for both name and filterText, so I've made that change too. Reviewers: ioeric Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D48375 llvm-svn: 335321	2018-06-22 06:41:43 +00:00
Sam McCall	dc8abc45d2	[clangd] Incorporate #occurrences in scoring code complete results. Summary: needs tests Reviewers: ilya-biryukov Subscribers: klimek, ioeric, MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D46183 llvm-svn: 331457	2018-05-03 14:53:02 +00:00
Haojian Wu	cbf20ef6ab	[clangd] Add "str()" method to SymbolID. Summary: This is a convenient function when we try to get std::string of SymbolID. Reviewers: ioeric Subscribers: klimek, ilya-biryukov, MaskRay, jkorous, cfe-commits Differential Revision: https://reviews.llvm.org/D46065 llvm-svn: 330835	2018-04-25 15:27:09 +00:00
Haojian Wu	545c02a710	[clangd] Add line and column number to the index symbol. Summary: LSP is using Line & column as symbol position, clangd needs to transfer file offset to Line & column when sending results back to LSP client, which is a high cost, especially for finding workspace symbol -- we have to read the file content from disk (if it isn't loaded in memory). Saving this information in the index will make clangd's life easier. Reviewers: sammccall Subscribers: klimek, ilya-biryukov, jkorous-apple, ioeric, MaskRay, cfe-commits Differential Revision: https://reviews.llvm.org/D45513 llvm-svn: 329997	2018-04-13 08:30:39 +00:00
Eric Liu	c5105f9e3c	[clangd] collect symbol #include & insert #include in global code completion. Summary: o Collect suitable #include paths for index symbols. This also does smart mapping for STL symbols and IWYU pragma (code borrowed from include-fixer). o For global code completion, add a command for inserting new #include in each code completion item. Reviewers: sammccall Reviewed By: sammccall Subscribers: klimek, mgorny, ilya-biryukov, jkorous-apple, hintonda, cfe-commits Differential Revision: https://reviews.llvm.org/D42640 llvm-svn: 325343	2018-02-16 14:15:55 +00:00
Haojian Wu	dc02a3d943	[clangd] SymbolLocation only covers symbol name. Summary: * Change the offset range to half-open, [start, end). * Fix a few fixmes. Reviewers: sammccall Subscribers: klimek, ilya-biryukov, jkorous-apple, ioeric, cfe-commits Differential Revision: https://reviews.llvm.org/D43182 llvm-svn: 324992	2018-02-13 09:53:50 +00:00
Sam McCall	6003951c66	[clangd] Collect definitions when indexing. Within a TU: - as now, collect a declaration from the first occurrence of a symbol (taking clang's canonical declaration) - when we first see a definition occurrence, copy the symbol and add it Across TUs/sources: - mergeSymbol in Merge.h is responsible for combining matching Symbols. This covers dynamic/static merges and cross-TU merges in the static index. - it prefers declarations from Symbols that have a definition. - GlobalSymbolBuilderMain is modified to use mergeSymbol as a reduce step. Random cleanups (can be pulled out): - SymbolFromYAML -> SymbolsFromYAML, new singular SymbolFromYAML added - avoid uninit'd SymbolLocations. Add an idiomatic way to check "absent". - CanonicalDeclaration (as well as Definition) are mapped as optional in YAML. - added operator<< for Symbol & SymbolLocation, for debugging Reviewers: ioeric, hokein Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits Differential Revision: https://reviews.llvm.org/D42942 llvm-svn: 324735	2018-02-09 14:42:01 +00:00
Eric Liu	7f24765912	[clangd] Use URIs in index symbols. Reviewers: hokein, sammccall Reviewed By: sammccall Subscribers: klimek, ilya-biryukov, jkorous-apple, cfe-commits Differential Revision: https://reviews.llvm.org/D42915 llvm-svn: 324358	2018-02-06 16:10:35 +00:00
Sam McCall	e2f43f500a	[clangd] Improve const-correctness of Symbol->Detail. NFC Summary: This would have caught a bug I wrote in an early version of D42049, where an index user could overwrite data internal to the index because the Symbol is not deep-const. The YAML traits are now a bit more verbose, but separate concerns a bit more nicely: ArenaPtr can be reused for other similarly-allocated objects, including scalars etc. Reviewers: hokein Subscribers: klimek, ilya-biryukov, cfe-commits, ioeric Differential Revision: https://reviews.llvm.org/D42059 llvm-svn: 322509	2018-01-15 20:09:09 +00:00
Eric Liu	76f6b44443	[clangd] Add more symbol information for code completion. Reviewers: hokein, sammccall Reviewed By: sammccall Subscribers: klimek, ilya-biryukov, cfe-commits Differential Revision: https://reviews.llvm.org/D41345 llvm-svn: 322097	2018-01-09 17:32:00 +00:00
Benjamin Kramer	50a967d601	[clangd] Simplify code. No functionality change intended. llvm-svn: 321523	2017-12-28 14:47:01 +00:00
Sam McCall	4b9bbb378b	[clangd] Use Builder for symbol slabs, and use sorted-vector for storage Summary: This improves a few things: - the insert -> freeze -> read sequence is now enforced/communicated by the type system - SymbolSlab::const_iterator iterates over symbols, not over id-symbol pairs - we avoid permanently storing a second copy of the IDs, and the string map's hashtable The slab size is now down to 21.8MB for the LLVM project. Of this only 2.7MB is strings, the rest is #symbols * `sizeof(Symbol)`. `sizeof(Symbol)` is currently 96, which seems too big - I think SymbolInfo isn't efficiently packed. That's a topic for another patch! Also added simple API to see the memory usage/#symbols of a slab, since it seems likely we will continue to care about this. Reviewers: ilya-biryukov Subscribers: klimek, mgrang, cfe-commits Differential Revision: https://reviews.llvm.org/D41506 llvm-svn: 321412	2017-12-23 19:38:03 +00:00
Sam McCall	df898cc5ed	[clangd] Don't re-hash SymbolID in maps, just use the SHA1 data llvm-svn: 321302	2017-12-21 20:11:46 +00:00
Sam McCall	6c0d0f5775	[clangd] Index symbols share storage within a slab. Summary: Symbols are not self-contained - it's only safe to hand them out if you guarantee the lifetime of the underlying data. Before this lands, I'm going to measure the before/after memory usage of the LLVM index loaded into memory in a single slab. Reviewers: hokein Subscribers: klimek, ilya-biryukov, cfe-commits Differential Revision: https://reviews.llvm.org/D41483 llvm-svn: 321272	2017-12-21 14:58:44 +00:00
Eric Liu	b99d5e8b62	[clangd] Put all #includes in one block in clangd source files. NFC Clang-format categorizes and sorts #includes with style. It doesn't make sense to manually manage #include blocks. llvm-svn: 320743	2017-12-14 21:22:03 +00:00
Haojian Wu	56a5fca473	[clangd] Construct SymbolSlab from YAML format. Summary: This will be used together with D40548 for the global index source (experimental). Reviewers: sammccall Reviewed By: sammccall Subscribers: klimek, mgorny, ilya-biryukov, cfe-commits, ioeric Differential Revision: https://reviews.llvm.org/D41178 llvm-svn: 320694	2017-12-14 12:17:14 +00:00
Ilya Biryukov	5a85b8e6dd	[clangd] clang-format the source code. NFC llvm-svn: 320577	2017-12-13 12:53:16 +00:00
Haojian Wu	4c1394d67d	[clangd] Introduce a "Symbol" class. Summary: * The "Symbol" class represents a C++ symbol in the codebase, containing all the information of a C++ symbol needed by clangd. clangd will use it in clangd's AST/dynamic index and global/static index (code completion and code navigation). * The SymbolCollector (another IndexAction) will be used to recollect the symbols when the source file is changed (for ASTIndex), or to generate all C++ symbols for the whole project. In the long term (when index-while-building is ready), clangd should share the same "Symbol" structure and IndexAction with index-while-building, but for now we want to have some stuff working in clangd. Reviewers: ioeric, sammccall, ilya-biryukov, malaperle Reviewed By: sammccall Subscribers: malaperle, klimek, mgorny, cfe-commits Differential Revision: https://reviews.llvm.org/D40897 llvm-svn: 320486	2017-12-12 15:42:10 +00:00

43 Commits