The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.
== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.
By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.
This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.
== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".
== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).
When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.
When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.
Differential Revision: https://reviews.llvm.org/D71775
Summary:
The CheckAtomic module performs two tests to determine if passing
'-latomic' to the linker is required: one for 64-bit atomics, and
another for non-64-bit atomics. clangd only uses the result from
HAVE_CXX_ATOMICS64_WITHOUT_LIB. This is incomplete because there are
uses of non-64-bit atomics in the code, such as the ReplyOnce::Replied
of type std::atomic<bool> defined in clangd/ClangdLSPServer.cpp.
Fix by also checking for the result of HAVE_CXX_ATOMICS_WITHOUT_LIB.
See also: https://reviews.llvm.org/D68964
Reviewers: ilya-biryukov, nridge, kadircet, beanz, compnerd, luismarques
Reviewed By: luismarques
Tags: #clang
Differential Revision: https://reviews.llvm.org/D69869
Summary:
Currently template parameters has symbolkind `Unknown`. This patch
introduces a new kind `TemplateParm` for templatetemplate, templatetype and
nontypetemplate parameters.
Also adds tests in clangd hover feature.
Reviewers: sammccall
Subscribers: kristof.beyls, ilya-biryukov, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73696
Summary:
Make it possible for the client to adjust the ranking by using the score Clangd
calculates for the completion items.
Reviewers: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74547
Summary:
Though this is not needed when using clangd's own index, other indexes
(e.g. kythe) need it, as classes and their constructors are different
symbols, otherwise we will miss renaming constructors.
Reviewers: kbobyrev
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74411
Summary:
Informative only, useful for positioning UI, interacting with other sources of
completion etc. As requested by an embedder of clangd.
Reviewers: usaxena95
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74305
Summary:
When renaming a class with template constructors, we are missing the
occurrences of the template constructors, because getUSRsForDeclaration doesn't
give USRs of the templated constructors (they are not in the normal `ctors()`
method).
Reviewers: kbobyrev
Subscribers: jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74216
The test got re-enabled after d54d71b67e landed.
However it seems that the order is still not deterministic as it
currently passes with -DLLVM_ENABLE_EXPENSIVE_CHECKS=OFF but randomly
fails with expensive checks ON.
Summary:
- This option forces a preamble rebuild to handle the odd case
of a missing header file being added
Reviewers: sammccall
Subscribers: ilya-biryukov, javed.absar, MaskRay, jkorous, arphaman, jfb, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73916
Summary:
Clangd does not find references of designated iniitializers yet and, as a
result, is unable to rename such references. This patch addresses this issue.
Resolves: https://github.com/clangd/clangd/issues/247
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: merge_guards_bot, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72867
Summary:
DeclarationName for cxx constructor is special, it is not an identifier.
thus the "Spelled" flag are not set for all ctor references, this patch
fixes it.
Reviewers: kbobyrev
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74125
This patch is based on D72746 and prevents non-spelled references from
being renamed which would cause incorrect behavior otherwise.
Reviewed by: hokein
Differential Revision: https://reviews.llvm.org/D74112
This patch allows the index does to provide a way to distinguish
implicit references (e.g. coming from macro expansions) from the spelled
ones. The corresponding flag was added to RefKind and symbols that are
referenced without spelling their name explicitly are now marked
implicit. This allows fixing incorrect behavior when renaming a symbol
that was referenced in macro expansions would try to rename macro
invocations.
Differential Revision: D72746
Reviewed by: hokein
Summary:
This is a fairly ugly hack - we back off several features for any variable
whose type isn't deduced, to avoid computing/caching linkage.
Better suggestions welcome.
Fixes https://github.com/clangd/clangd/issues/274
Reviewers: kadircet, kbobyrev
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73960
Summary: By default it's 512K, which is way to small for clang parser to run on. There is no way to do it via platform-independent API, so it's implemented via pthreads directly in clangd/Threading.cpp.
Fixes https://github.com/clangd/clangd/issues/273
Patch by Dmitry Kozhevnikov!
Reviewers: ilya-biryukov, sammccall, arphaman
Reviewed By: ilya-biryukov, sammccall, arphaman
Subscribers: dexonsmith, umanwizard, jfb, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D50993
Summary:
Currently we delay AST rebuilds by 500ms after each edit, to wait for
further edits. This is a win if a rebuild takes 5s, and a loss if it
takes 50ms.
This patch sets debouncepolicy = clamp(min, ratio * rebuild_time, max).
However it sets min = max = 500ms so there's no policy change or actual
customizability - will do that in a separate patch.
See https://github.com/clangd/clangd/issues/275
Reviewers: hokein
Subscribers: ilya-biryukov, javed.absar, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73873
Summary:
Default args might exist but be unparsed or uninstantiated.
getDefaultArg asserts on those. This patch makes sure we don't crash in such
scenarios.
Reviewers: sammccall, ilya-biryukov
Subscribers: MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73723
Summary:
Previously we mmapped on unix and not on windows: on windows mmap takes
an exclusive lock on the file and prevents the user saving changes!
The failure mode on linux is a bit more subtle: if the file is changed on disk
but the SourceManager sticks around, then subsequent operations on the
SourceManager will fail as invariants are violated (e.g. null-termination).
This commonly manifests as crashes after switching git branches with many files
open in clangd.
Nominally mmap is for performance here, and we should be willing to give some
up to stop crashing. Measurements on my system (linux+desktop+SSD) at least
show no measurable regression on an a fairly IO-heavy workload: drop disk caches,
open SemaOverload.cpp, wait for first diagnostics.
for i in `seq 100`; do
for variant in mmap volatile; do
echo 3 | sudo tee /proc/sys/vm/drop_caches
/usr/bin/time --append --quiet -o ~/timings -f "%C %E" \
bin/clangd.$variant -sync -background-index=0 < /tmp/mirror > /dev/null
done
done
bin/clangd.mmap -sync -background-index=0 0:07.60
bin/clangd.volatile -sync -background-index=0 0:07.89
bin/clangd.mmap -sync -background-index=0 0:07.44
bin/clangd.volatile -sync -background-index=0 0:07.89
bin/clangd.mmap -sync -background-index=0 0:07.42
bin/clangd.volatile -sync -background-index=0 0:07.50
bin/clangd.mmap -sync -background-index=0 0:07.90
bin/clangd.volatile -sync -background-index=0 0:07.53
bin/clangd.mmap -sync -background-index=0 0:07.64
bin/clangd.volatile -sync -background-index=0 0:07.55
bin/clangd.mmap -sync -background-index=0 0:07.75
bin/clangd.volatile -sync -background-index=0 0:07.47
bin/clangd.mmap -sync -background-index=0 0:07.90
bin/clangd.volatile -sync -background-index=0 0:07.50
bin/clangd.mmap -sync -background-index=0 0:07.81
bin/clangd.volatile -sync -background-index=0 0:07.95
bin/clangd.mmap -sync -background-index=0 0:07.55
bin/clangd.volatile -sync -background-index=0 0:07.65
bin/clangd.mmap -sync -background-index=0 0:08.15
bin/clangd.volatile -sync -background-index=0 0:07.54
bin/clangd.mmap -sync -background-index=0 0:07.78
bin/clangd.volatile -sync -background-index=0 0:07.61
bin/clangd.mmap -sync -background-index=0 0:07.78
bin/clangd.volatile -sync -background-index=0 0:07.55
bin/clangd.mmap -sync -background-index=0 0:07.41
bin/clangd.volatile -sync -background-index=0 0:07.40
bin/clangd.mmap -sync -background-index=0 0:07.54
bin/clangd.volatile -sync -background-index=0 0:07.42
bin/clangd.mmap -sync -background-index=0 0:07.45
bin/clangd.volatile -sync -background-index=0 0:07.49
bin/clangd.mmap -sync -background-index=0 0:07.95
bin/clangd.volatile -sync -background-index=0 0:07.66
bin/clangd.mmap -sync -background-index=0 0:08.04
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73617
Summary:
there is a slight behavior change in this patch:
- before: `in^t a;`, returns our internal error message (no symbol at given location)
- after: `in^t a, returns null, and client displays their message (e.g.
e.g. "the element can't be renamed" in vscode).
both are sensible according LSP, and we'd save one `rename` call in the later case.
Reviewers: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73610
Summary:
No need to pass fno-delayed-template-parsing as the opposite flag is
only passed to cc1 when abi is set to msvc. Sending as a follow-up to D73613 to
keep changes in the release branch minimal.
Reviewers: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73615
Summary:
This patch adds a simple mechanism to disallow global rename
on std symbols. We might extend it to other symbols, e.g. protobuf.
Reviewers: kadircet
Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73450