llvm-project/llvm/tools
Alexandre Ganea 8404aeb56a [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, at most 64 hardware threads could be used, in some cases all dispatched on a single CPU socket.

== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, a hyper-thread if SMT is active; a core otherwise. Older 32-bit versions of Windows have a limit of 32 processors, which was later raised to 64 processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is pointer-sized (sizeof(void*) bytes, one bit per processor). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.
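
As a rough illustration of the arithmetic above, the following sketch derives the mask width from the pointer size and a lower bound on the number of "processor groups" for a given processor count. The function names are hypothetical and are not part of this patch or of the Windows API; the actual grouping chosen by Windows may use more groups than this lower bound (for example, to follow socket or NUMA boundaries).

```cpp
#include <cassert>
#include <cstddef>

// Width, in bits, of a pointer-sized affinity mask: one bit per processor.
// (Hypothetical helper, for illustration only.)
inline unsigned affinityMaskBits(std::size_t pointerBytes) {
  return static_cast<unsigned>(pointerBytes * 8);
}

// Lower bound on the number of processor groups needed to cover
// `logicalProcessors`, given that one mask addresses `maskBits` processors.
// ASSUMPTION: this is only a ceiling division; the OS may create more groups.
inline unsigned minProcessorGroups(unsigned logicalProcessors,
                                   unsigned maskBits) {
  assert(maskBits > 0 && "a mask must address at least one processor");
  return (logicalProcessors + maskBits - 1) / maskBits;
}
```

With a 64-bit mask, 128 hyper-threads need at least two groups, while 64 hyper-threads fit in one.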

By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.

This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64 cores, 128 hyper-threads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation will only get worse in the years to come, as dies with more cores appear on the market.

== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost; only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std::thread::hardware_concurrency() -- which can only return processors from the current "processor group".
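
The numbers in this example can be checked with a small sketch. It assumes, as the example implies, that each socket maps to its own "processor group"; the helper names are hypothetical and do not exist in LLVM.

```cpp
#include <cassert>

// All hardware threads in the machine.
inline unsigned totalHardwareThreads(unsigned sockets, unsigned coresPerSocket,
                                     unsigned threadsPerCore) {
  assert(sockets > 0 && coresPerSocket > 0 && threadsPerCore > 0);
  return sockets * coresPerSocket * threadsPerCore;
}

// Hardware threads visible from a single processor group.
// ASSUMPTION: one processor group per socket, as in the example above.
inline unsigned threadsInOneGroup(unsigned coresPerSocket,
                                  unsigned threadsPerCore) {
  return coresPerSocket * threadsPerCore;
}

// What a "one thread per core" request would ideally cover: every core
// on every socket.
inline unsigned totalCores(unsigned sockets, unsigned coresPerSocket) {
  return sockets * coresPerSocket;
}
```

For 2 sockets of 18 cores with 2 hyper-threads each, the machine has 72 hardware threads and 36 cores, but a single group exposes only 36 hardware threads. The value 36 returned today therefore describes one socket's hyper-threads, not one thread per core across both sockets.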

== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. Computing the number of threads to use is deferred as late as possible, until the moment the std::threads are created (ThreadPool in the case of ThinLTO).

When using hardware_concurrency(), setting ThreadCount to 0 now means using all the possible hardware CPU (SMT) threads. Providing a ThreadCount above the maximum number of threads has no effect; the maximum is used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.
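
The rules above can be modeled with a minimal sketch. This is not the ThreadPoolStrategy class from the patch: the struct, its members and the factory functions below are hypothetical, and the machine topology is passed in explicitly instead of being queried from the OS.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for a strategy object: it records the caller's
// intent and resolves it to a thread count only when asked.
struct StrategySketch {
  unsigned requested; // 0 means "use the maximum available"
  bool onePerCore;    // true models heavyweight_hardware_concurrency()

  // Resolve the intent against a given topology (supplied by the caller
  // here; a real implementation would query the OS at this point).
  unsigned computeThreadCount(unsigned logicalProcessors,
                              unsigned physicalCores) const {
    assert(physicalCores > 0 && logicalProcessors >= physicalCores);
    unsigned maxThreads = onePerCore ? physicalCores : logicalProcessors;
    if (requested == 0)
      return maxThreads;
    // Requests above the maximum are clamped to the maximum.
    return std::min(requested, maxThreads);
  }
};

// Hypothetical factories mirroring the two APIs described above.
inline StrategySketch hardwareConcurrencySketch(unsigned threadCount = 0) {
  return StrategySketch{threadCount, false};
}
inline StrategySketch heavyweightSketch(unsigned threadCount = 0) {
  return StrategySketch{threadCount, true};
}
```

On the 72-thread, 36-core machine from the earlier example, hardwareConcurrencySketch() resolves to 72 and heavyweightSketch() to 36, and a request for 1000 threads is clamped to the respective maximum. Keeping the intent in the object, instead of returning a bare number early, is what allows the count to be resolved at thread-creation time.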

When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.

Differential Revision: https://reviews.llvm.org/D71775
2020-02-14 10:24:22 -05:00
..
bugpoint Revert "Remove redundant "std::move"s in return statements" 2020-02-10 07:07:40 -08:00
bugpoint-passes Reverted the remainings of c1c9819ef9 2020-02-11 16:20:06 -08:00
dsymutil [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups 2020-02-14 10:24:22 -05:00
gold [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups 2020-02-14 10:24:22 -05:00
llc [Remarks] Extend the RemarkStreamer to support other emitters 2020-02-04 17:16:02 -08:00
lli Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
llvm-ar [llvm-ar] Simplify Windows comparePaths NFCI 2020-02-14 11:20:17 +00:00
llvm-as Rename F_{None,Text,Append} to OF_{None,Text,Append}. NFC 2019-08-05 05:43:48 +00:00
llvm-as-fuzzer
llvm-bcanalyzer Revert "Remove redundant "std::move"s in return statements" 2020-02-10 07:07:40 -08:00
llvm-c-test Move the sysroot attribute from DIModule to DICompileUnit 2020-01-17 12:55:40 -08:00
llvm-cat Rename F_{None,Text,Append} to OF_{None,Text,Append}. NFC 2019-08-05 05:43:48 +00:00
llvm-cfi-verify Revert "Remove redundant "std::move"s in return statements" 2020-02-10 07:07:40 -08:00
llvm-config Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
llvm-cov [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups 2020-02-14 10:24:22 -05:00
llvm-cvtres Share /machine: handling code with llvm-cvtres too 2019-06-12 11:32:43 +00:00
llvm-cxxdump [llvm/Object] - Make ELFObjectFile::getRelocatedSection return Expected<section_iterator> 2019-10-21 11:06:38 +00:00
llvm-cxxfilt Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
llvm-cxxmap Rename F_{None,Text,Append} to OF_{None,Text,Append}. NFC 2019-08-05 05:43:48 +00:00
llvm-diff llvm-diff: Perform structural comparison on GlobalVariables, if possible 2019-12-17 14:21:48 -05:00
llvm-dis [llvm-dis] Fix the disable-output flag 2019-11-14 13:35:21 -08:00
llvm-dwarfdump [llvm-dwarfdump][Stats] Fix the License header 2020-02-10 08:01:56 +01:00
llvm-dwp Revert "Remove redundant "std::move"s in return statements" 2020-02-10 07:07:40 -08:00
llvm-elfabi Revert "Remove redundant "std::move"s in return statements" 2020-02-10 07:07:40 -08:00
llvm-exegesis [AsmPrinter][MCStreamer] De-capitalize EmitInstruction and EmitCFI* 2020-02-13 22:08:55 -08:00
llvm-extract [llvm-extract] Add -keep-const-init commandline option 2020-02-03 14:30:28 +09:00
llvm-go Reinstate llvm-go to test the go bindings. 2020-02-13 17:24:55 -08:00
llvm-ifs Revert "Remove redundant "std::move"s in return statements" 2020-02-10 07:07:40 -08:00
llvm-isel-fuzzer Move CodeGenFileType enum to Support/CodeGen.h 2019-11-13 16:39:34 -08:00
llvm-itanium-demangle-fuzzer
llvm-jitlink Revert "Remove redundant "std::move"s in return statements" 2020-02-10 07:07:40 -08:00
llvm-jitlistener
llvm-link Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
llvm-lipo Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
llvm-lto Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
llvm-lto2 [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups 2020-02-14 10:24:22 -05:00
llvm-mc [AsmPrinter][MCStreamer] De-capitalize EmitInstruction and EmitCFI* 2020-02-13 22:08:55 -08:00
llvm-mc-assemble-fuzzer Remove AllTargetsAsmPrinters 2020-01-17 19:04:06 -05:00
llvm-mc-disassemble-fuzzer Remove AllTargetsAsmPrinters 2020-01-17 19:04:06 -05:00
llvm-mca [AsmPrinter][MCStreamer] De-capitalize EmitInstruction and EmitCFI* 2020-02-13 22:08:55 -08:00
llvm-microsoft-demangle-fuzzer
llvm-ml [AsmPrinter][MCStreamer] De-capitalize EmitInstruction and EmitCFI* 2020-02-13 22:08:55 -08:00
llvm-modextract Rename F_{None,Text,Append} to OF_{None,Text,Append}. NFC 2019-08-05 05:43:48 +00:00
llvm-mt
llvm-nm Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
llvm-objcopy [llvm-objcopy][WebAssembly] Add dump/add/remove-section support 2020-02-11 15:17:18 -08:00
llvm-objdump [llvm-objdump] Print file format in lowercase to match GNU output. 2020-02-12 08:17:01 -08:00
llvm-opt-fuzzer Sink all InitializePasses.h includes 2019-11-13 16:34:37 -08:00
llvm-opt-report Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
llvm-pdbutil Use std::foo_t rather than std::foo in LLVM. 2020-02-11 15:12:51 -08:00
llvm-profdata [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups 2020-02-14 10:24:22 -05:00
llvm-rc Revert "Remove redundant "std::move"s in return statements" 2020-02-10 07:07:40 -08:00
llvm-readobj Revert "Remove redundant "std::move"s in return statements" 2020-02-10 07:07:40 -08:00
llvm-reduce Revert "[llvm-reduce] add ReduceAttribute delta pass" 2020-02-05 14:15:11 -05:00
llvm-rtdyld Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
llvm-shlib [cmake] Explicitly mark libraries defined in lib/ as "Component Libraries" 2019-11-21 10:48:08 -08:00
llvm-size [llvm-size] print a blank line between archieve members when using sysv format 2020-01-03 14:05:55 +08:00
llvm-special-case-list-fuzzer
llvm-split Rename F_{None,Text,Append} to OF_{None,Text,Append}. NFC 2019-08-05 05:43:48 +00:00
llvm-stress [llvm] Migrate llvm::make_unique to std::make_unique 2019-08-15 15:54:37 +00:00
llvm-strings [binutils] Add response file option to help and docs 2019-06-21 11:49:20 +00:00
llvm-symbolizer [llvm-symbolizer]Fix printing of malformed address values not passed via stdin 2020-01-08 18:37:41 +08:00
llvm-undname Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
llvm-xray Use std::foo_t rather than std::foo in LLVM. 2020-02-11 15:12:51 -08:00
llvm-yaml-numeric-parser-fuzzer
lto [LTO][Legacy] Add API for passing LLVM options separately 2019-11-19 16:30:37 -08:00
msbuild vs integration: bump version nbr 2019-06-19 07:39:53 +00:00
obj2yaml Revert "Remove redundant "std::move"s in return statements" 2020-02-10 07:07:40 -08:00
opt [Remarks] Extend the RemarkStreamer to support other emitters 2020-02-04 17:16:02 -08:00
opt-viewer [opt viewer] Python compat - decode/encode string 2020-01-29 14:49:24 -08:00
remarks-shlib [Remarks] Add parser for bitstream remarks 2019-09-09 17:43:50 +00:00
sancov [llvm] Replace SmallStr.str().str() with std::string conversion operator. 2020-01-29 21:16:46 -08:00
sanstats
verify-uselistorder Rename F_{None,Text,Append} to OF_{None,Text,Append}. NFC 2019-08-05 05:43:48 +00:00
vfabi-demangle-fuzzer [llvm][VectorUtils] Tweak VFShape for scalable vector functions. 2020-01-30 05:53:56 +00:00
xcode-toolchain
yaml2obj [yaml2obj] Add -D k=v to preprocess the input YAML 2020-02-07 09:35:00 -08:00
CMakeLists.txt Continue removing llgo. 2020-02-10 10:33:58 -08:00
LLVMBuild.txt [llvm-ifs][IFS] llvm Interface Stubs merging + object file generation tool. 2019-08-30 18:26:05 +00:00