llvm-project

Commit Graph

Author	SHA1	Message	Date
Fangrui Song	6f23049119	[Support] Simplify and optimize ThreadPool * Merge QueueLock and CompletionLock. * Avoid spurious CompletionCondition.notify_all() when ActiveThreads is greater than 0. * Use default member initializers. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D78856	2020-04-28 12:20:42 -07:00
Alexandre Ganea	8404aeb56a	[Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket. == Background == Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads. By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to. This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market. == The problem == The heavyweight_hardware_concurrency() API was introduced so that only one hardware thread per core was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std::thread::hardware_concurrency() -- which can only return processors from the current "processor group". == The changes in this patch == To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO). When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead. The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware core will be used. When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once. Differential Revision: https://reviews.llvm.org/D71775	2020-02-14 10:24:22 -05:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Zachary Turner	9b8b0794b8	Revert "Enable ThreadPool to queue tasks that return values." This is failing to compile when LLVM_ENABLE_THREADS is false, and the fix is not immediately obvious, so reverting while I look into it. llvm-svn: 334658	2018-06-13 21:24:19 +00:00
Zachary Turner	1b76a128a8	Enable ThreadPool to support tasks that return values. Previously ThreadPool could only queue async "jobs", i.e. work that was done for its side effects and not for its result. It's useful occasionally to queue async work that returns a value. From an API perspective, this is very intuitive. The previous API just returned a shared_future<void>, so all we need to do is make it return a shared_future<T>, where T is the type of value that the operation returns. Making this work required a little magic, but ultimately it's not too bad. Instead of keeping a shared queue<packaged_task<void()>> we just keep a shared queue<unique_ptr<TaskBase>>, where TaskBase is a class with a pure virtual execute() method, then have a templated derived class that stores a packaged_task<T()>. Everything else works out pretty cleanly. Differential Revision: https://reviews.llvm.org/D48115 llvm-svn: 334643	2018-06-13 19:29:16 +00:00
Hans Wennborg	86f0b70f37	Speculative build fix for lld on Linux after Michael's #include removals llvm-svn: 320645	2017-12-13 22:12:57 +00:00
Michael Zolotukhin	da9f402677	Remove redundant includes from lib/Support. llvm-svn: 320627	2017-12-13 21:30:58 +00:00
Jan Korous	c723f65709	[Support] Fix locking of shared variable in threadpool llvm-svn: 319027	2017-11-27 13:42:03 +00:00
Rafael Espindola	8c0ff9508d	Bring r314809 back. But now include a check for CPU_COUNT so we still build on 10 year old versions of glibc. Original message: Use sched_getaffinity instead of std::thread::hardware_concurrency. The issue with std::thread::hardware_concurrency is that it forwards to libc and some implementations (like glibc) don't take thread affinity into consideration. With this change a llvm program that can execute in only 2 cores will use 2 threads, even if the machine has 32 cores. This makes benchmarking a lot easier, but should also help if someone doesn't want to use all cores for compilation for example. llvm-svn: 314931	2017-10-04 20:27:01 +00:00
Daniel Neilson	bef94bcbae	Revert D38481 due to missing cmake check for CPU_COUNT Summary: This reverts D38481. The change breaks systems with older versions of glibc. It injects a use of CPU_COUNT() from sched.h without checking to ensure that the function exists first. Reviewers: Subscribers: llvm-svn: 314922	2017-10-04 18:19:03 +00:00
Rafael Espindola	6e182fbab4	Use sched_getaffinity instead of std::thread::hardware_concurrency. The issue with std::thread::hardware_concurrency is that it forwards to libc and some implementations (like glibc) don't take thread affinity into consideration. With this change a llvm program that can execute in only 2 cores will use 2 threads, even if the machine has 32 cores. This makes benchmarking a lot easier, but should also help if someone doesn't want to use all cores for compilation for example. llvm-svn: 314809	2017-10-03 16:25:15 +00:00
Peter Collingbourne	b78a68db7b	Support: Remove MSVC 2013 workarounds in ThreadPool class. I have confirmed that these are no longer needed with MSVC 2015. Differential Revision: https://reviews.llvm.org/D34187 llvm-svn: 305347	2017-06-14 00:36:21 +00:00
Davide Italiano	0f0d5d8f8d	[ThreadPool] Rollback recent changes until I figure out the breakage. llvm-svn: 288018	2016-11-28 09:17:12 +00:00
Davide Italiano	3ea0bfa7e0	[ThreadPool] Simplify the interface. NFCI. The callers don't use the return value. Found by Michael Spencer. llvm-svn: 288016	2016-11-28 08:53:41 +00:00
Jason Henline	703788373a	Removing whitespace from test commit rL273447 Undoing the trivial change I introduced in rL273447. llvm-svn: 273449	2016-06-22 18:01:11 +00:00
Jason Henline	4fe43f9b4a	Add whitespace to check commit access No functional changes. Just adding whitespace in a comment in order to check that I am able to push a commit to the repo. llvm-svn: 273447	2016-06-22 17:40:02 +00:00
Justin Lebar	9e479e4763	Fix a race condition in support library ThreadPool. By running TSAN on the ThreadPool unit tests it was discovered that the threads in the pool can pop tasks off the queue at the same time the "wait" routine is trying to check if the task queue is empty. This patch fixes this problem by checking for active threads in the waiter before checking whether the queue is empty. Patch by Jason Henline. Differential Revision: http://reviews.llvm.org/D18811 Reviewers: joker.eph, jlebar llvm-svn: 265618	2016-04-06 23:46:40 +00:00
Mehdi Amini	bebca1c496	Fix MSVC build with LLVM_ENABLE_THREADS=OFF Follow-up to the ThreadPool implementation. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255621	2015-12-15 05:53:41 +00:00
Mehdi Amini	33a7ea4b9a	Add a C++11 ThreadPool implementation in LLVM This is a very simple implementation of a thread pool using C++11 thread. It accepts any std::function<void()> for asynchronous execution. Individual task can be synchronize using the returned future, or the client can block on the full queue completion. In case LLVM is configured with Threading disabled, it falls back to sequential execution using std::async with launch:deferred. This is intended to support parallelism for ThinLTO processing in linker plugin, but is generic enough for any other uses. This is a recommit of r255444 ; trying to workaround a bug in the MSVC 2013 standard library. I think I was hit by: http://connect.microsoft.com/VisualStudio/feedbackdetail/view/791185/std-packaged-task-t-where-t-is-void-or-a-reference-class-are-not-movable Recommit of r255589, trying to please g++ as well. Differential Revision: http://reviews.llvm.org/D15464 From: mehdi_amini <mehdi_amini@91177308-0d34-0410-b5e6-96231b3b80d8> llvm-svn: 255593	2015-12-15 00:59:19 +00:00
Mehdi Amini	2bc6a5ad84	Revert "Add a C++11 ThreadPool implementation in LLVM" This reverts commit r255589. Breaks g++ From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255591	2015-12-15 00:42:44 +00:00
Mehdi Amini	ef0ef2860d	Add a C++11 ThreadPool implementation in LLVM This is a very simple implementation of a thread pool using C++11 thread. It accepts any std::function<void()> for asynchronous execution. Individual task can be synchronize using the returned future, or the client can block on the full queue completion. In case LLVM is configured with Threading disabled, it falls back to sequential execution using std::async with launch:deferred. This is intended to support parallelism for ThinLTO processing in linker plugin, but is generic enough for any other uses. This is a recommit of r255444 ; trying to workaround a bug in the MSVC 2013 standard library. I think I was hit by: http://connect.microsoft.com/VisualStudio/feedbackdetail/view/791185/std-packaged-task-t-where-t-is-void-or-a-reference-class-are-not-movable Differential Revision: http://reviews.llvm.org/D15464 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255589	2015-12-15 00:38:05 +00:00
Nico Weber	c2a687b6a6	Revert r255444. It doesn't build on Windows and broke the Windows LLD and LLDB bots: http://lab.llvm.org:8011/builders/lld-x86_64-win7/builds/27693/steps/build_Lld/logs/stdio http://lab.llvm.org:8011/builders/lldb-x86-windows-msvc/builds/13468/steps/build/logs/stdio llvm-svn: 255446	2015-12-13 04:14:39 +00:00
Mehdi Amini	396abbb6f0	Add a C++11 ThreadPool implementation in LLVM This is a very simple implementation of a thread pool using C++11 thread. It accepts any std::function<void()> for asynchronous execution. Individual task can be synchronize using the returned future, or the client can block on the full queue completion. In case LLVM is configured with Threading disabled, it falls back to sequential execution using std::async with launch:deferred. This is intended to support parallelism for ThinLTO processing in linker plugin, but is generic enough for any other uses. Differential Revision: http://reviews.llvm.org/D15464 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255444	2015-12-12 22:55:25 +00:00

23 Commits
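
The design described in commit 1b76a128a8 (a shared queue of unique_ptr<TaskBase>, where TaskBase has a pure virtual execute() and a templated derived class stores a packaged_task<T()>) can be illustrated with a minimal sketch. This is not LLVM's actual code, and that commit was reverted shortly afterwards. TaskBase and execute() are named in the commit message; TypedTask, SequentialQueue, async() and runAll() are hypothetical names chosen for illustration. The sketch is deliberately single-threaded, so it shows only the type erasure and none of the locking or worker threads a real pool needs.

```cpp
#include <future>
#include <memory>
#include <queue>
#include <utility>

// Type-erased base class: the queue stores tasks without knowing
// their result type.
struct TaskBase {
  virtual ~TaskBase() = default;
  virtual void execute() = 0;
};

// Hypothetical derived class holding a packaged_task for one result type T.
template <typename T>
struct TypedTask final : TaskBase {
  explicit TypedTask(std::packaged_task<T()> PT) : Task(std::move(PT)) {}
  void execute() override { Task(); } // Fulfils the associated future.
  std::packaged_task<T()> Task;
};

// Hypothetical single-threaded stand-in for a pool: tasks are queued by
// async() and run in FIFO order by runAll().
class SequentialQueue {
public:
  template <typename Fn>
  auto async(Fn &&F) -> std::shared_future<decltype(F())> {
    using T = decltype(F());
    std::packaged_task<T()> PT(std::forward<Fn>(F));
    std::shared_future<T> Result = PT.get_future().share();
    Tasks.push(std::make_unique<TypedTask<T>>(std::move(PT)));
    return Result;
  }

  void runAll() {
    while (!Tasks.empty()) {
      std::unique_ptr<TaskBase> Next = std::move(Tasks.front());
      Tasks.pop();
      Next->execute();
    }
  }

private:
  std::queue<std::unique_ptr<TaskBase>> Tasks;
};
```

With this sketch, a caller could queue `Q.async([] { return 6 * 7; })`, call `Q.runAll()`, and then read the value from the returned shared_future<int> with `get()`. Tasks with a void result go through the same path as TypedTask<void>.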