llvm-project

Commit Graph

Author	SHA1	Message	Date
Sam McCall	4d006520b8	[clangd] Clean up unused includes. NFCI Add includes where needed to fix build. Haven't systematically added used headers, so there is still accidental dependency on transitive includes.	2022-02-26 12:00:16 +01:00
Christian Kühnel	5bd643d31d	[clangd] cleanup of header guard names Renaming header guards to match the LLVM convention. This patch was created by automatically applying the fixes from clang-tidy. I've removed the [NFC] tag from the title, as we're adding header guards in some files and thus might trigger behavior changes. Differential Revision: https://reviews.llvm.org/D113896	2021-12-02 15:58:35 +00:00
Sam McCall	3c5745cb1f	[clangd] Make background index thread count calculation clearer Summary: This confusion was inadvertently introduced in a change to the heavyweight_hardware_concurrency API: `8404aeb56a` - don't indirect through the rebuilder policy when building the thread pool - document that rebuilder thresholds are exposed for testing only - don't use 0 as a sentinel value for "all threads", as we use it as a sentinel value for "synchronous" (though unsupported for BackgroundIndex) - rather than pick some new sentinel value, just always use 4 threads for tests Reviewers: kadircet, aganea Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D82352	2020-06-25 00:18:53 +02:00
Alexandre Ganea	8404aeb56a	[Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket. == Background == Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads. By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to. This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market. == The problem == The heavyweight_hardware_concurrency() API was introduced so that only one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group". == The changes in this patch == To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO). When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead. The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware core will be used. When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once. Differential Revision: https://reviews.llvm.org/D71775	2020-02-14 10:24:22 -05:00
Kadir Cetinkaya	91e5f4b46b	Revert "Revert r366458, r366467 and r366468" This reverts commit `9c377105da`. [clangd][BackgroundIndexLoader] Directly store DependentTU while loading shard Summary: We were deferring the population of DependentTU field in LoadedShard until BackgroundIndexLoader was consumed. This actually triggers a use after free since the shards FileToTU was pointing at could've been moved while consuming the Loader. Reviewers: sammccall Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D64980 llvm-svn: 366559	2019-07-19 10:18:52 +00:00
Azharuddin Mohammed	9c377105da	Revert r366458, r366467 and r366468 r366458 is causing test failures. r366467 and r366468 had to be reverted as they were casuing conflict while reverting r366458. r366468 [clangd] Remove dead code from BackgroundIndex r366467 [clangd] BackgroundIndex stores shards to the closest project r366458 [clangd] Refactor background-index shard loading llvm-svn: 366551	2019-07-19 09:26:33 +00:00
Kadir Cetinkaya	40073f922a	[clangd] Refactor background-index shard loading Reviewers: sammccall Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D64712 llvm-svn: 366458	2019-07-18 16:25:36 +00:00
Sam McCall	06377ae2e5	[clangd] Don't rebuild background index until we indexed one TU per thread. Summary: This increases the odds that the boosted file (cpp file matching header) will be ready. (It always enqueues first, so it'll be present unless another thread indexes two files before the first thread indexes one.) Reviewers: kadircet Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64682 llvm-svn: 366199	2019-07-16 10:17:06 +00:00
Sam McCall	2f760c44e6	[clangd] Rewrite of logic to rebuild the background index serving structures. Summary: Previously it was rebuilding every 5s by default, which was much too frequent in the long run - the goal was to provide an early build. There were also some bugs. There were also some bugs, and a dedicated thread was used in production but not tested. - rebuilds are triggered by #TUs built, rather than time. This should scale more sensibly to fast vs slow machines. - there are two separate indexed-TU thresholds to trigger index build: 5 TUs for the first build, 100 for subsequent rebuilds. - rebuild is always done on the regular indexing threads, and is affected by blockUntilIdle. This means unit/lit tests run the production configuration. - fixed a bug where we'd rebuild after attempting to load shards, even if there were no shards. - the BackgroundIndexTests don't really test the subtleties of the rebuild policy (for determinism, we call blockUntilIdle, so rebuild-on-idle is enough to pass the tests). Instead, we expose the rebuilder as a separate class and have fine-grained tests for it. Reviewers: kadircet Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64291 llvm-svn: 365531	2019-07-09 18:30:49 +00:00

9 Commits