llvm-project/llvm/unittests/Support
Alexandre Ganea 8404aeb56a [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, at most 64 hardware threads could be used on Windows, in some cases all dispatched to a single CPU socket.

== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, a hyper-thread if SMT is active; a core otherwise. Older 32-bit versions of Windows were limited to 32 processors, and 64-bit versions of Windows later raised that limit to 64 processors. This limit comes from the affinity mask, which historically is a pointer-sized integer (sizeof(void*) bytes, one bit per processor). Consequently, the concept of "processor groups" was introduced to deal with systems with more than 64 hyper-threads.
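
To make the limit concrete, here is a minimal sketch of the arithmetic behind it. `affinityMaskBits` and `NativeMaskBits` are hypothetical names used for illustration only; they are not part of Windows or LLVM.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical helper: how many processors a bitmask of MaskBytes bytes can
// address when each processor is represented by one bit.
constexpr std::size_t affinityMaskBits(std::size_t MaskBytes) {
  return MaskBytes * CHAR_BIT;
}

// Width of a pointer-sized mask on whatever platform compiles this file.
constexpr std::size_t NativeMaskBits = affinityMaskBits(sizeof(void *));

// A 4-byte mask covers 32 processors, an 8-byte mask covers 64
// (assuming the usual 8-bit byte).
static_assert(affinityMaskBits(4) == 32, "32-bit mask addresses 32 processors");
static_assert(affinityMaskBits(8) == 64, "64-bit mask addresses 64 processors");
```

Any system with more processors than `NativeMaskBits` cannot be described by a single mask, which is the gap that processor groups fill.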

By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.

This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64 cores, 128 hyper-threads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation can only get worse in the years to come, as dies with more cores appear on the market.
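
As a rough illustration of why such a CPU ends up split, the sketch below computes the smallest number of groups that can hold a given number of logical processors. It assumes the 64-processor group limit described above; `minProcessorGroups` is a hypothetical helper, and the real partitioning done by Windows also depends on the NUMA topology, so an actual system may report more groups than this lower bound.

```cpp
#include <cassert>

// Assumption for illustration: one processor group holds at most 64 logical
// processors, matching the affinity mask width on 64-bit Windows.
constexpr unsigned MaxProcessorsPerGroup = 64;

// Hypothetical helper: lower bound on the number of processor groups needed
// for LogicalProcessors, computed with a ceiling division.
constexpr unsigned minProcessorGroups(unsigned LogicalProcessors) {
  return (LogicalProcessors + MaxProcessorsPerGroup - 1) /
         MaxProcessorsPerGroup;
}

// 128 hyper-threads need at least two groups.
static_assert(minProcessorGroups(128) == 2, "two groups for 128 processors");
```

A process that stays in its initial group therefore sees at most 64 of the 128 hyper-threads.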

== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost; only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std::thread::hardware_concurrency() -- which can only return processors from the current "processor group".
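
The numbers in this example can be reproduced with a small model. The types and functions below are hypothetical and exist only for illustration; the sketch assumes each socket is projected as its own processor group, which is consistent with the value of 36 quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one processor group.
struct ProcessorGroup {
  unsigned Cores;
  unsigned ThreadsPerCore;
};

// Logical processors (hyper-threads) contained in a single group.
unsigned logicalProcessors(const ProcessorGroup &G) {
  return G.Cores * G.ThreadsPerCore;
}

// What a group-unaware query reports: only the group the process runs in.
unsigned visibleInCurrentGroup(const std::vector<ProcessorGroup> &Groups,
                               std::size_t Current) {
  return logicalProcessors(Groups[Current]);
}

// What the whole machine offers across all groups.
unsigned totalLogicalProcessors(const std::vector<ProcessorGroup> &Groups) {
  unsigned Total = 0;
  for (const ProcessorGroup &G : Groups)
    Total += logicalProcessors(G);
  return Total;
}
```

With two groups of 18 cores and 2 hyper-threads each, `visibleInCurrentGroup` yields 36 while `totalLogicalProcessors` yields 72, which is the gap this patch closes.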

== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. Computing the number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).

When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above the maximum number of threads has no effect; the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.
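
The rules above can be sketched as follows. This is not the ThreadPoolStrategy implementation from the patch; `StrategySketch`, `MachineSketch`, and `computeThreadCount` are hypothetical names, and the machine description is passed in explicitly instead of being queried from the OS.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the "intention" that is carried to the point of
// use. ThreadsRequested == 0 means "use everything available".
struct StrategySketch {
  unsigned ThreadsRequested;
  bool UseHyperThreads; // false: at most one thread per physical core
};

// Hypothetical description of the machine, summed over all processor groups.
struct MachineSketch {
  unsigned Cores;
  unsigned ThreadsPerCore;
};

// Resolve the strategy into a concrete thread count as late as possible.
unsigned computeThreadCount(const StrategySketch &S, const MachineSketch &M) {
  // Upper bound: all SMT threads, or one thread per core for heavyweight work.
  unsigned Max = S.UseHyperThreads ? M.Cores * M.ThreadsPerCore : M.Cores;
  if (S.ThreadsRequested == 0)
    return Max;
  // Requests above the maximum are clamped to the maximum.
  return std::min(S.ThreadsRequested, Max);
}
```

For the 2-socket machine discussed earlier (36 cores, 2 hyper-threads per core), a request of 0 resolves to 72 with hyper-threads and to 36 with one thread per core, and a request of 100 is clamped to 72.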

When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.

Differential Revision: https://reviews.llvm.org/D71775
2020-02-14 10:24:22 -05:00
..
DynamicLibrary Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
ARMAttributeParser.cpp [ARM] add target arch definitions for 8.1-M and MVE 2019-05-30 12:57:04 +00:00
AlignOfTest.cpp [NFC] Remove LLVM_ALIGNAS 2019-07-31 03:22:08 +00:00
AlignmentTest.cpp [Alignment][NFC] Deprecate Align::None() 2020-01-24 12:53:58 +01:00
AllocatorTest.cpp Reland "[llvm] Add a way to speed up the speed in which BumpPtrAllocator increases slab sizes"" 2020-02-03 12:06:15 +01:00
AnnotationsTest.cpp
ArrayRecyclerTest.cpp
BinaryStreamTest.cpp [Support] Split MallocAllocator out of Allocator.h 2020-01-24 17:29:32 -08:00
BlockFrequencyTest.cpp
BranchProbabilityTest.cpp
CMakeLists.txt Revert "Disable exit-on-SIGPIPE in lldb" 2019-10-24 13:19:49 -07:00
CRCTest.cpp Make llvm::crc32() work also for input sizes larger than 32 bits. 2020-02-05 21:32:11 +01:00
CachePruningTest.cpp
Casting.cpp [llvm] Migrate llvm::make_unique to std::make_unique 2019-08-15 15:54:37 +00:00
CheckedArithmeticTest.cpp
Chrono.cpp
CommandLineTest.cpp [CommandLine] Add missing Callbacks 2019-12-09 11:37:34 +00:00
CompressionTest.cpp build: reduce CMake handling for zlib 2020-01-02 11:19:12 -08:00
ConvertUTFTest.cpp
CrashRecoveryTest.cpp [Support] Optionally call signal handlers when a function wrapped by the CrashRecoveryContext fails 2020-01-11 15:27:07 -05:00
DJBTest.cpp
DataExtractorTest.cpp [DebugInfo][Support] Replace DWARFDataExtractor size function 2020-01-13 10:53:00 +00:00
DebugCounterTest.cpp
DebugTest.cpp
EndianStreamTest.cpp
EndianTest.cpp
ErrnoTest.cpp
ErrorOrTest.cpp
ErrorTest.cpp [Error] Make llvm::cantFail include the original error messages 2019-10-17 21:54:15 +00:00
FileCheckTest.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
FileCollectorTest.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
FileOutputBufferTest.cpp [LLD][ELF] Support --[no-]mmap-output-file with F_no_mmap 2019-10-29 15:49:08 -07:00
FileUtilitiesTest.cpp [llvm] Replace SmallStr.str().str() with std::string conversion operator. 2020-01-29 21:16:46 -08:00
FormatVariadicTest.cpp Use C++14-style return type deduction in LLVM. 2020-02-11 07:38:42 -08:00
GlobPatternTest.cpp Reapply r375051: [support] GlobPattern: add support for `\` and `[!...]`, and allow `]` in more places 2019-10-17 18:09:05 +00:00
Host.cpp [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups 2020-02-14 10:24:22 -05:00
ItaniumManglingCanonicalizerTest.cpp llvm-cxxmap: fix support for remapping non-mangled names. 2019-12-18 10:47:02 -08:00
JSONTest.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
KnownBitsTest.cpp [NFC][KnownBits] Add getMinValue() / getMaxValue() methods 2019-12-03 20:04:51 +03:00
LEB128Test.cpp
LineIteratorTest.cpp
LockFileManagerTest.cpp
MD5Test.cpp
ManagedStatic.cpp
MatchersTest.cpp
MathExtrasTest.cpp [Alignment][NFC] Support compile time constants 2019-10-14 09:04:15 +00:00
MemoryBufferTest.cpp [Support] Improve readNativeFile(Slice) interface 2019-08-22 08:13:30 +00:00
MemoryTest.cpp [Support] Renamed member 'Size' to 'AllocatedSize' in MemoryBlock and OwningMemoryBlock. 2019-05-20 20:53:05 +00:00
NativeFormatTests.cpp
ParallelTest.cpp
Path.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
ProcessTest.cpp [Support][NFC] Add an explicit unit test for Process::getPageSize() 2020-01-09 18:14:05 +00:00
ProgramTest.cpp Revert "raw_ostream: add operator<< overload for std::error_code" 2019-08-14 13:59:04 +00:00
RegexTest.cpp
ReplaceFileTest.cpp Revert "raw_ostream: add operator<< overload for std::error_code" 2019-08-14 13:59:04 +00:00
ReverseIterationTest.cpp [NFC] Fixes -Wrange-loop-analysis warnings 2020-01-07 00:51:41 +01:00
ScaledNumberTest.cpp
SourceMgrTest.cpp
SpecialCaseListTest.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
StringPool.cpp
SwapByteOrderTest.cpp
SymbolRemappingReaderTest.cpp
TarWriterTest.cpp
TargetParserTest.cpp [ARM][TargetParser] Improve handling of dependencies between target features 2020-02-05 16:07:51 +00:00
TaskQueueTest.cpp [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups 2020-02-14 10:24:22 -05:00
ThreadLocalTest.cpp
ThreadPool.cpp [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups 2020-02-14 10:24:22 -05:00
Threading.cpp [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups 2020-02-14 10:24:22 -05:00
TimerTest.cpp
TrailingObjectsTest.cpp [Alignment][NFC] Move and type functions from MathExtras to Alignment 2019-10-14 13:14:34 +00:00
TrigramIndexTest.cpp [llvm] Migrate llvm::make_unique to std::make_unique 2019-08-15 15:54:37 +00:00
TypeNameTest.cpp
TypeTraitsTest.cpp Fix compilation warnings when compiling with GCC 7.3 2019-05-06 13:41:54 +00:00
UnicodeTest.cpp
VersionTupleTest.cpp
VirtualFileSystemTest.cpp [VFS] More consistent support for Windows 2020-02-05 11:38:20 -08:00
YAMLIOTest.cpp YAML parser robustness improvements 2019-11-05 21:51:04 -08:00
YAMLParserTest.cpp Fix null dereference in yaml::Document::skip 2019-11-11 20:48:28 -08:00
formatted_raw_ostream_test.cpp
raw_ostream_test.cpp Make llvm::StringRef to std::string conversions explicit. 2020-01-28 23:25:25 +01:00
raw_pwrite_stream_test.cpp Revert "raw_ostream: add operator<< overload for std::error_code" 2019-08-14 13:59:04 +00:00
raw_sha1_ostream_test.cpp [Support] Optimize SHA1 implementation 2019-11-11 22:14:28 -08:00
xxhashTest.cpp