Commit Graph

36 Commits

Author SHA1 Message Date
David Tenty 318ed90144 [AIX][llvm][support] Implement getHostCPUName
We implement getHostCPUName() for AIX via systemcfg interfaces since access to the processor version register is a privileged operation. We return a value based on the  current processor implementation mode.

This fixes the cpu detection used by clang for `-mcpu=native`.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D95966
2021-02-09 16:30:18 -05:00
Ryan Houdek 045d84f4e6 D94954: Fixes Snapdragon Kryo CPU core detection
All of these families were claiming to be a73 based, which was causing
-mcpu/mtune=native to never use the newer features available to these
cores.

Goes through each and bumps the individual cores to their respective Big
counterparts. Since this code path doesn't support big.little detection,
there was already a precedent set with the Qualcomm line to choose the
big cores only.

Adds a comment on each line for the product's name that the part number
refers to. Confirmed on-device and through Linux header naming
convections.

Additionally newer SoCs mix CPU implementer parts from multiple
implementers. Both 0x41 (ARM) and 0x51 (Qualcomm) in the Snapdragon case

This was causing a desync in information where the scan at the start to
find the implementer would mismatch the part scan later on.
Now scan for both implementer and part at the start so these stay in
sync.

Differential Revision: https://reviews.llvm.org/D94954
2021-01-20 22:23:43 +00:00
Amara Emerson a1265690cf Fix failing triple test for macOS 11 with non-zero minor versions.
Differential Revision: https://reviews.llvm.org/D94197
2021-01-06 14:57:37 -08:00
Kai Nacke bca1b8ed99 [SystemZ/ZOS] Implement computeHostNumPhysicalCores
On z/OS, the information is stored in the Common System Data Area
(CSD). It is the number of CPs allocated to the current LPAR.

Reviewers: aganea, hubert.reinterpertcast, MaskRay

Reviewed By: hubert.reinterpertcast

Differential Revision: https://reviews.llvm.org/D85531
2020-08-12 08:31:33 -04:00
Ettore Tiotto 36a4f10376 Fix computeHostNumPhysicalCores() for Linux on POWER and Linux on Z
ThinLTO is run using a single thread on Linux on Power. The
compute_thread_count() routine calls getHostNumPhysicalCores which
returns -1 by default, and so `MaxThreadCount is set to 1.

unsigned llvm::ThreadPoolStrategy::compute_thread_count() const {
    int MaxThreadCount = UseHyperThreads
          ? computeHostNumHardwareThreads()
          : sys::getHostNumPhysicalCores();
     if (MaxThreadCount <= 0)
        MaxThreadCount = 1;
   …
}
Fix: provide custom implementation of getHostNumPhysicalCores for
Linux on Power and Linux on Z.

Reviewed By: Kai, uweigand

Differential Revision: https://reviews.llvm.org/D84764
2020-07-30 18:05:36 +00:00
Sjoerd Meijer 5ecf85a5fc [AArch64] Add native CPU detection for Neoverse N1
Map the CPU ID value 0xd0c to "neoverse-n1".

Patch by James Greenhalgh.

Differential Revision: https://reviews.llvm.org/D80736
2020-05-28 19:54:18 +01:00
Fangrui Song f7f8c1cd9a [Support][unittest] Fix HostTest.NumPhysicalCores on __i386__ after D78324 2020-05-19 21:51:14 -07:00
Raul Tambre 0863e94ebd [AArch64] Add NVIDIA Carmel support
Summary:
NVIDIA's Carmel ARM64 cores are used in Tegra194 chips found in Jetson AGX Xavier, DRIVE AGX Xavier and DRIVE AGX Pegasus.

References:
* https://devblogs.nvidia.com/nvidia-jetson-agx-xavier-32-teraops-ai-robotics/#h.huq9xtg75a5e
* NVIDIA Xavier Series System-on-Chip Technical Reference Manual 1.3 (https://developer.nvidia.com/embedded/downloads#?search=Xavier%20Series%20SoC%20Technical%20Reference%20Manual)

Reviewers: sdesmalen, paquette

Reviewed By: sdesmalen

Subscribers: llvm-commits, ianshmean, kristof.beyls, hiraditya, jfb, danielkiss, cfe-commits, t.p.northover

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D77940
2020-05-04 13:52:30 +01:00
KAWASHIMA Takahiro c8cd1a994d [AArch64] Add support for Fujitsu A64FX
A64FX is an Armv8.2-A CPU used in FUJITSU Supercomputer
PRIMEHPC FX1000, PRIMEHPC FX700, and supercomputer Fugaku.

https://www.fujitsu.com/global/products/computing/servers/supercomputer/specifications/

Differential Revision: https://reviews.llvm.org/D75594
2020-03-09 19:15:09 +09:00
Amy Huang cb36bfa3de Fix 01b02a73de to use correct macro spelling and fix unit tests. 2020-02-14 15:58:36 -08:00
Alexandre Ganea 8404aeb56a [Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket.

== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.

By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.

This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.

== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std:🧵:hardware_concurrency() -- which can only return processors from the current "processor group".

== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).

When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead.
The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.

When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.

Differential Revision: https://reviews.llvm.org/D71775
2020-02-14 10:24:22 -05:00
Evandro Menezes 215da6606c [clang][llvm] Obsolete Exynos M1 and M2 2019-10-30 15:02:59 -05:00
Jonas Devlieghere 0eaee545ee [llvm] Migrate llvm::make_unique to std::make_unique
Now that we've moved to C++14, we no longer need the llvm::make_unique
implementation from STLExtras.h. This patch is a mechanical replacement
of (hopefully) all the llvm::make_unique instances across the monorepo.

llvm-svn: 369013
2019-08-15 15:54:37 +00:00
Pavel Labath 0d802a4923 Revert "raw_ostream: add operator<< overload for std::error_code"
This reverts commit r368849, because it breaks some bots (e.g.
llvm-clang-x86_64-win-fast).

It turns out this is not as NFC as we had hoped, because operator== will
consider two std::error_codes to be distinct even though they both hold
"success" values if they have different categories.

llvm-svn: 368854
2019-08-14 13:59:04 +00:00
Pavel Labath 40837e97b1 raw_ostream: add operator<< overload for std::error_code
Summary:
The main motivation for this is unit tests, which contain a large macro
for pretty-printing std::error_code, and this macro is duplicated in
every file that needs to do this. However, the functionality may be
useful elsewhere too.

In this patch I have reimplemented the existing ASSERT_NO_ERROR macros
to reuse the new functionality, but I have kept the macro (as a
one-liner) as it is slightly more readable than ASSERT_EQ(...,
std::error_code()).

Reviewers: sammccall, ilya-biryukov

Subscribers: zturner, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65643

llvm-svn: 368849
2019-08-14 13:33:28 +00:00
Simon Pilgrim 8eacea80ad Appease MSVC builds by #ifdef wrapping runAndGetCommandOutput tests. NFCI.
llvm-svn: 356042
2019-03-13 11:51:13 +00:00
Hubert Tong 72db2abcc7 Use AIX version detection at LLVM run-time
Summary:
AIX compilers define macros based on the version of the operating
system.

This patch implements updating of versionless AIX triples to include the
host AIX version. Also, the host triple detection in the build system is
adjusted to strip the AIX version information so that the run-time
detection is preferred.

Reviewers: xingxue, stefanp, nemanjai, jasonliu

Reviewed By: xingxue

Subscribers: mgorny, kristina, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58798

llvm-svn: 355995
2019-03-13 00:12:43 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Bryan Chan 123553921f [AArch64] Support HiSilicon's TSV110 processor
Reviewers: t.p.northover, SjoerdMeijer, kristof.beyls

Reviewed By: kristof.beyls

Subscribers: olista01, javed.absar, kristof.beyls, kristina, llvm-commits

Differential Revision: https://reviews.llvm.org/D53908

llvm-svn: 346546
2018-11-09 19:32:08 +00:00
Joel Jones 0a6c000c16 [AArch64] -mcpu=native CPU detection for Cavium processors
This small patch updates the CPU detection for Cavium processors when
-mcpu=native is passed on compile-line.

Patch by Stefan Teleman
Differential Revision: https://reviews.llvm.org/D51939

llvm-svn: 343897
2018-10-05 22:23:21 +00:00
Zachary Turner 08426e1f9f Refactor ExecuteAndWait to take StringRefs.
This simplifies some code which had StringRefs to begin with, and
makes other code more complicated which had const char* to begin
with.

In the end, I think this makes for a more idiomatic and platform
agnostic API.  Not all platforms launch process with null terminated
c-string arrays for the environment pointer and argv, but the api
was designed that way because it allowed easy pass-through for
posix-based platforms.  There's a little additional overhead now
since on posix based platforms we'll be takign StringRefs which
were constructed from null terminated strings and then copying
them to null terminate them again, but from a readability and
usability standpoint of the API user, I think this API signature
is strictly better.

llvm-svn: 334518
2018-06-12 17:43:52 +00:00
Evandro Menezes 5d7a9e6e54 [AArch64] Add Exynos to host detection
Differential revision: https://reviews.llvm.org/D40985

llvm-svn: 320195
2017-12-08 21:09:59 +00:00
Chad Rosier 71070856e6 [AArch64] Add basic support for Qualcomm's Saphira CPU.
llvm-svn: 314105
2017-09-25 14:05:00 +00:00
Balaram Makam a1e7ecc734 [Falkor] Add falkor CPU to host detection
This returns "falkor" for Falkor CPU.

llvm-svn: 313998
2017-09-22 17:46:36 +00:00
Eli Friedman bde9fc76dd [ARM] Add more CPUs to host detection
This returns "cortex-a73" for second-generation Kryo; not precisely
correct, but close enough.

Differential Revision: https://reviews.llvm.org/D37724

llvm-svn: 313200
2017-09-13 21:48:00 +00:00
Vedant Kumar bc26f72655 [unittests] Fix up test after rL313156
Bot: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/42421
llvm-svn: 313163
2017-09-13 18:00:22 +00:00
Alex Lorenz 3803df3dcd [Support] sys::getProcessTriple should return a macOS triple using
the system's version of macOS

sys::getProcessTriple returns LLVM_HOST_TRIPLE, whose system version might not
be the actual version of the system on which the compiler running. This commit
ensures that, for macOS, sys::getProcessTriple returns a triple with the
system's macOS version.

rdar://33177551

Differential Revision: https://reviews.llvm.org/D34446

llvm-svn: 307372
2017-07-07 09:53:47 +00:00
Yi Kong 57019dc9b2 Implement host CPU detection for AArch64
This shares detection logic with ARM(32), since AArch64 capable CPUs may
also run in 32-bit system mode.

We observe weird /proc/cpuinfo output for MSM8992 and MSM8994, where
they report all CPU cores as one single model, depending on which CPU
core the kernel is running on. As a workaround, we hardcode the known
CPU part name for these SoCs.

For big.LITTLE systems, this patch would only return the part name of
the first core (usually the little core). Proper support will be added
in a follow-up change.

Differential Revision: D31675

llvm-svn: 299458
2017-04-04 19:06:04 +00:00
Kristof Beyls 77ce4f6e37 Make naming in Host.h in line with coding standards.
Based on post-commit review comments by Chandler Carruth on
https://reviews.llvm.org/D31236. Thanks!

llvm-svn: 299211
2017-03-31 13:06:40 +00:00
Kristof Beyls 7a76b315d6 Revert "Make naming in Host.h in line with coding standards."
This reverts r299062, which caused build failures on Windows.
It also reverts the attempts to fix the windows builds in r299064 and r299065.
The introduction of namespace llvm::sys::detail makes MSVC, and seemingly also
mingw, complain about ambiguity with the existing namespace llvm::detail.
E.g.:
C:\b\slave\sanitizer-windows\llvm\include\llvm/Support/MathExtras.h(184): error C2872: 'detail': ambiguous symbol
C:\b\slave\sanitizer-windows\llvm\include\llvm/Support/PointerLikeTypeTraits.h(31): note: could be 'llvm::detail'
C:\b\slave\sanitizer-windows\llvm\include\llvm/Support/Host.h(80): note: or       'llvm::sys::detail'

In r299064 and r299065 I tried to fix these ambiguities, based on the errors
reported in the log files. It seems however that the build stops early when
this kind of error is encountered, and many build-then-fix-iterations on
Windows may be needed to fix this. Therefore reverting r299062 for now to
get the build working again on Windows.

llvm-svn: 299066
2017-03-30 11:06:25 +00:00
Kristof Beyls ca878c943b Make naming in Host.h in line with coding standards.
Based on post-commit review comments by Chandler Carruth on
https://reviews.llvm.org/D31236. Thanks!

llvm-svn: 299062
2017-03-30 09:31:59 +00:00
Kristof Beyls 9e46396ecc Refactor getHostCPUName to allow testing on non-native hardware.
This refactors getHostCPUName so that for the architectures that get the
host cpu info on linux from /proc/cpuinfo, the /proc/cpuinfo parsing
logic is present in the build, even if it wasn't built on a linux system
for that architecture.

Since the code is present in the build, we can then test that code also
on other systems, i.e. we don't need to have buildbots setup for all
architectures on linux to be able to test this. Instead, developers will
test this as part of the regression test run.

As an example, a few unit tests are added to test getHostCPUName for ARM
running linux. A unit test is preferred over a lit-based test, since the
expectation is that in the future, the functionality here will grow over
what can be tested with "llc -mcpu=native".

This is a preparation step to enable implementing the range of
improvements discussed on PR30516, such as adding AArch64 support,
support for big.LITTLE systems, reducing code duplication.

Differential Revision: https://reviews.llvm.org/D31236

llvm-svn: 299060
2017-03-30 07:24:49 +00:00
Ahmed Bougacha f0fe1a87fe [Support] Simplify triple check in Host CPU test. NFC.
Cleanup the check added in r293990 using the Triple helpers.

llvm-svn: 294073
2017-02-04 00:46:59 +00:00
Ahmed Bougacha f60b68457e [Support] Accept macosx triple as 'darwin' in Host unittest. NFC.
If LLVM was configured with an x86_64-apple-macosx host triple, this
test would fail, as the API works but the triple isn't in the whitelist.

llvm-svn: 293990
2017-02-03 01:32:39 +00:00
Mehdi Amini db46b7d217 Add computeHostNumPhysicalCores() implementation for Darwin
Differential Revision: https://reviews.llvm.org/D25800

llvm-svn: 284656
2016-10-19 22:36:07 +00:00
Teresa Johnson 7943fecee8 Add interface to compute number of physical cores on host system
Summary:
For now I have only added support for x86_64 Linux, but other systems
can be added incrementally.

This is to be used for setting the default parallelism for ThinLTO
backends (instead of thread::hardware_concurrency which includes
hyperthreading and is too aggressive). I'll send this as a follow-on
patch, and it will fall back to hardware_concurrency when the new
getHostNumPhysicalCores returns -1 (when not supported for a given
host system).

I also added an interface to MemoryBuffer to force reading a file
as a stream - this is required for /proc/cpuinfo which is a special
file that looks like a normal file but appears to have 0 size.
The existing readers of this file in Host.cpp are reading the first
1024 or so bytes from it, because the necessary info is near the top.
But for the new functionality we need to be able to read the entire
file. I can go back and change the other readers to use the new
getFileAsStream as a follow-on patch since it seems much more robust.

Added a unittest.

Reviewers: mehdi_amini

Subscribers: beanz, mgorny, llvm-commits, modocache

Differential Revision: https://reviews.llvm.org/D25564

llvm-svn: 284138
2016-10-13 17:43:20 +00:00