llvm-project

Commit Graph

Author	SHA1	Message	Date
Xiang1 Zhang	aded4f0cc0	[X86-64] Support Intel AMX instructions Summary: INTEL ADVANCED MATRIX EXTENSIONS (AMX). AMX is a new programming paradigm, it has a set of 2-dimensional registers (TILES) representing sub-arrays from a larger 2-dimensional memory image and operate on TILES. Spec can be found in Chapter 3 here https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewers: LuoYuanke, annita.zhang, pengfei, RKSimon, xiangzhangllvm Reviewed By: xiangzhangllvm Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D82705	2020-07-02 08:57:04 +08:00
Craig Topper	23654d9e7a	Recommit "[X86] Calculate the needed size of the feature arrays in _cpu_indicator_init and getHostCPUName using the size of the feature enum." Hopefully this version will fix the previously buildbot failure	2020-06-22 13:32:03 -07:00
Craig Topper	bebea4221d	Revert "[X86] Calculate the needed size of the feature arrays in _cpu_indicator_init and getHostCPUName using the size of the feature enum." Seems to breaking build. This reverts commit `5ac144fe64`.	2020-06-22 12:20:40 -07:00
Craig Topper	5ac144fe64	[X86] Calculate the needed size of the feature arrays in _cpu_indicator_init and getHostCPUName using the size of the feature enum. Move 0 initialization up to the caller so we don't need to know the size.	2020-06-22 11:46:20 -07:00
Craig Topper	2831f7852f	[X86] Remove brand_id check from getHostCPUName. Brand index was a feature some Pentium III and Pentium 4 CPUs. It provided an index into a software lookup table to provide a brand name for the CPU. This is separate from the family/model. It's unclear to me why this index being non-zero was used to block checking family/model. I think the effect of this is that -march=native was not working correctly on the CPUs that have a non-zero brand index. They are all about 20 years old so this probably hasn't affected many users.	2020-06-12 20:38:30 -07:00
Craig Topper	a27d0dcf65	[X86] Combine the three feature variables in getHostCPUName into an array and pass it around as an array reference. This makes the setting and clearing of bits simpler.	2020-06-12 18:30:41 -07:00
Craig Topper	0ce9bf6eed	[X86] Add a helper lambda to getIntelProcessorTypeAndSubtype to select feature bits from the correct 32-bit feature variable. We have three 32 bit variables containing feature bits. But our enum is a flat 96 bit space. So we need to pick which of the variables to use based on the bit value. We used to do this manually by mentioning the correct variable and subtracting an offset from the enum. But this is error prone.	2020-06-11 21:14:46 -07:00
Craig Topper	c525168190	[X86] Remove unnecessary #if around call to isCpuIdSupported in getHostCPUName. The exact same #if is already inside isCpuIdSupported and causes it to return true. The definition of isCpuIdSupported isn't conditional so we should be able just rely on its body doing the right thing.	2020-06-11 15:13:28 -07:00
Craig Topper	ed34140e11	[X86] Move X86 stuff out of TargetParser.h and into the recently created X86TargetParser.h. NFC	2020-06-10 22:06:34 -07:00
Craig Topper	ba8d182597	Revert "[X86] Move X86 stuff out of TargetParser.h and into the recently created X86TargetParser.h. NFC" This reverts commit `874800b4f7`. Forgot to update the clang includes	2020-06-10 21:24:44 -07:00
Craig Topper	874800b4f7	[X86] Move X86 stuff out of TargetParser.h and into the recently created X86TargetParser.h. NFC	2020-06-10 21:18:32 -07:00
Sjoerd Meijer	5ecf85a5fc	[AArch64] Add native CPU detection for Neoverse N1 Map the CPU ID value 0xd0c to "neoverse-n1". Patch by James Greenhalgh. Differential Revision: https://reviews.llvm.org/D80736	2020-05-28 19:54:18 +01:00
Craig Topper	69ede516c7	[X86] Add 'avx512vp2intersect' to getHostCPUFeatures.	2020-05-28 09:57:17 -07:00
Lei Huang	2368bf52cd	[PowerPC] Add support for -mcpu=pwr10 in both clang and llvm Summary: This patch simply adds support for the new CPU in anticipation of Power10. There isn't really any functionality added so there are no associated test cases at this time. Reviewers: stefanp, nemanjai, amyk, hfinkel, power-llvm-team, #powerpc Reviewed By: stefanp, nemanjai, amyk, #powerpc Subscribers: NeHuang, steven.zhang, hiraditya, llvm-commits, wuzish, shchenz, cfe-commits, kbarton, echristo Tags: #clang, #powerpc, #llvm Differential Revision: https://reviews.llvm.org/D80020	2020-05-27 13:14:25 -05:00
Lei Huang	559845f8fe	Revert "[PowerPC] Add support for -mcpu=pwr10 in both clang and llvm" This reverts commit `7eb666b155`.	2020-05-27 09:40:21 -05:00
Lei Huang	7eb666b155	[PowerPC] Add support for -mcpu=pwr10 in both clang and llvm Summary: This patch simply adds support for the new CPU in anticipation of Power10. There isn't really any functionality added so there are no associated test cases at this time. Reviewers: stefanp, nemanjai, amyk, hfinkel, power-llvm-team, #powerpc Reviewed By: stefanp, nemanjai, amyk, #powerpc Subscribers: NeHuang, steven.zhang, hiraditya, llvm-commits, wuzish, shchenz, cfe-commits, kbarton, echristo Tags: #clang, #powerpc, #llvm Differential Revision: https://reviews.llvm.org/D80020	2020-05-26 13:48:22 -05:00
Benjamin Kramer	82bee922af	Make FEATURE_AVX512VP2INTERSECT match between compiler-rt and LLVM compiler-rt also doesn't support bits >= 64 as far as I know.	2020-05-25 15:18:04 +02:00
Craig Topper	2bb822bc90	[X86] Add family/model for Intel Comet Lake CPUs for -march=native and function multiversioning This adds the family/model returned by CPUID for some Intel Comet Lake CPUs. Instruction set and tuning wise these are the same as "skylake". These are not in the Intel SDM yet, but these should be correct.	2020-05-24 00:29:25 -07:00
Raul Tambre	0863e94ebd	[AArch64] Add NVIDIA Carmel support Summary: NVIDIA's Carmel ARM64 cores are used in Tegra194 chips found in Jetson AGX Xavier, DRIVE AGX Xavier and DRIVE AGX Pegasus. References: * https://devblogs.nvidia.com/nvidia-jetson-agx-xavier-32-teraops-ai-robotics/#h.huq9xtg75a5e * NVIDIA Xavier Series System-on-Chip Technical Reference Manual 1.3 (https://developer.nvidia.com/embedded/downloads#?search=Xavier%20Series%20SoC%20Technical%20Reference%20Manual) Reviewers: sdesmalen, paquette Reviewed By: sdesmalen Subscribers: llvm-commits, ianshmean, kristof.beyls, hiraditya, jfb, danielkiss, cfe-commits, t.p.northover Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D77940	2020-05-04 13:52:30 +01:00
Fangrui Song	fce115681b	[Support][X86] Include sched.h after D78324 http://lab.llvm.org:8011/builders/clang-hexagon-elf/builds/28848/steps/build%20stage%201/logs/stdio	2020-04-17 08:46:27 -07:00
Fangrui Song	d441188c15	[Support][X86] Change getHostNumPhsicalCores() to return number of physical cores enabled by affinity Fixes https://bugs.llvm.org/show_bug.cgi?id=45556 While here, make the x86-64 code available for x86-32. The output has been available and stable since https://git.kernel.org/linus/3dd9d514846cdca1dcef2e4fce666d85e199e844 (2005) ``` processor: ... physical id: siblings: core id: ``` Don't check HAVE_SCHED_GETAFFINITY/HAVE_CPU_COUNT. The interface is simply available in every libc which can build LLVM. Reviewed By: aganea Differential Revision: https://reviews.llvm.org/D78324	2020-04-17 07:45:04 -07:00
WangTianQing	a3dc949000	[X86] Add TSXLDTRK instructions. Summary: For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Reviewers: craig.topper, RKSimon, LuoYuanke Reviewed By: craig.topper Subscribers: mgorny, hiraditya, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77205	2020-04-09 13:17:29 +08:00
WangTianQing	d08fadd662	[X86] Add SERIALIZE instruction. Summary: For more details about this instruction, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Reviewers: craig.topper, RKSimon, LuoYuanke Reviewed By: craig.topper Subscribers: mgorny, hiraditya, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77193	2020-04-02 16:19:23 +08:00
Reid Kleckner	47359fbd2e	Drop a StringMap.h include, NFC $ diff -u <(sort thedeps-before.txt) <(sort thedeps-after.txt) | grep '^[-+] ' | sort | uniq -c | sort -nr 231 - llvm/include/llvm/ADT/StringMap.h 171 - llvm/include/llvm/Support/AllocatorBase.h 142 - llvm/include/llvm/Support/PointerLikeTypeTraits.h	2020-03-11 15:45:34 -07:00
KAWASHIMA Takahiro	c8cd1a994d	[AArch64] Add support for Fujitsu A64FX A64FX is an Armv8.2-A CPU used in FUJITSU Supercomputer PRIMEHPC FX1000, PRIMEHPC FX700, and supercomputer Fugaku. https://www.fujitsu.com/global/products/computing/servers/supercomputer/specifications/ Differential Revision: https://reviews.llvm.org/D75594	2020-03-09 19:15:09 +09:00
Luke Geeson	7d594cf003	[ARM] Add Cortex-M55 Support for clang and llvm This patch upstreams support for the ARM Armv8.1m cpu Cortex-M55. In detail adding support for: - mcpu option in clang - Arm Target Features in clang - llvm Arm TargetParser definitions details of the CPU can be found here: https://developer.arm.com/ip-products/processors/cortex-m/cortex-m55 Reviewers: chill Reviewed By: chill Subscribers: dmgreen, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D74966	2020-03-02 11:42:26 +00:00
Luke Geeson	4518aab289	[AArch64] Add Cortex-A34 Support for clang and llvm This patch upstreams support for the AArch64 Armv8-A cpu Cortex-A34. In detail adding support for: - mcpu option in clang - AArch64 Target Features in clang - llvm AArch64 TargetParser definitions details of the cpu can be found here: https://developer.arm.com/ip-products/processors/cortex-a/cortex-a34 Reviewers: SjoerdMeijer Reviewed By: SjoerdMeijer Subscribers: SjoerdMeijer, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D74483 Change-Id: Ida101fc544ca183a0a0e61a1277c8957855fde0b	2020-02-18 14:56:16 +00:00
Jim Lin	466f8843f5	[NFC] Remove trailing space sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h,td}	2020-02-18 10:49:13 +08:00
Amy Huang	cb36bfa3de	Fix `01b02a73de` to use correct macro spelling and fix unit tests.	2020-02-14 15:58:36 -08:00
Amy Huang	01b02a73de	Don't call computeHostNumPhysicalCores when LLVM_ENABLE_THREADS is off Summary: Fix change from `8404aeb56a` to avoid calling computeHostNumPhysicalCores if LLVM_ENABLE_THREADS is off. Reviewers: rnk, aganea Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74654	2020-02-14 15:09:27 -08:00
Alexandre Ganea	8404aeb56a	[Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, a maximum of 64 hardware threads could be used at most, in some cases dispatched only on one CPU socket. == Background == Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, an hyper-thread if SMT is active; a core otherwise. There's a limit of 32-bit processors on older 32-bit versions of Windows, which later was raised to 64-processors with 64-bit versions of Windows. This limit comes from the affinity mask, which historically is represented by the sizeof(void*). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads. By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to. This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.
== The problem == The heavyweight_hardware_concurrency() API was introduced so that only one hardware thread per core was used. Once that API returns, that original intention is lost, only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std::thread::hardware_concurrency() -- which can only return processors from the current "processor group". == The changes in this patch == To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO). When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above to the maximum number of threads will have no effect, the maximum will be used instead. The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware core will be used. When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once. Differential Revision: https://reviews.llvm.org/D71775	2020-02-14 10:24:22 -05:00
Stefan Pintilie	dcceab1a0a	[PowerPC] Add new Future CPU for PowerPC in LLVM This is a continuation of D70262 The previous patch as listed above added the future CPU in clang. This patch adds the future CPU in the PowerPC backend. At this point the patch simply assumes that a future CPU will have the same characteristics as pwr9. Those characteristics may change with later patches. Differential Revision: https://reviews.llvm.org/D70333	2019-11-27 14:30:06 -06:00
Florian Hahn	82921bf2ba	[Support] Don't check XCR0 when detecting avx512 on Darwin. Darwin lazily saves the AVX512 context on first use [1]: instead of checking that it already does to figure out if the OS supports AVX512, trust that the kernel will do the right thing and always assume the context save support is available. [1] https://github.com/apple/darwin-xnu/blob/xnu-4903.221.2/osfmk/i386/fpu.c#L174 Reviewers: ab, RKSimon, craig.topper Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D70453	2019-11-21 09:18:00 +00:00
Craig Topper	ff75bf6ac9	[X86] Add AMD Matisse (znver2) model number to getHostCPUName and compiler-rt's getAMDProcessorTypeAndSubtype. This is the CPUID model used on Ryzen 3000 series (Zen 2/Matisse) CPUs. Patch by Alex James Differential Revision: https://reviews.llvm.org/D70279	2019-11-18 11:57:04 -08:00
Chris Bieneman	34688fafea	Implement `sys::getHostCPUName()` for Darwin ARM Summary: Currently there is no implementation of `sys::getHostCPUName()` for Darwin ARM targets. This patch makes it so that LLVM running on ARM makes reasonable guesses about the CPU features of the host CPU. Reviewers: t.p.northover, lhames, efriedma Reviewed By: efriedma Subscribers: rjmccall, efriedma, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69597	2019-11-05 17:49:16 -08:00
Evandro Menezes	215da6606c	[clang][llvm] Obsolete Exynos M1 and M2	2019-10-30 15:02:59 -05:00
Martin Storsjo	353ac42ce2	[Support, ARM64] Define getHostCPUFeatures for Windows on ARM64 platform Patch by Adam Kallai! Differential Revision: https://reviews.llvm.org/D68139 llvm-svn: 373445	2019-10-02 11:04:55 +00:00
Ulrich Weigand	819c1651f7	[SystemZ] Support z15 processor name The recently announced IBM z15 processor implements the architecture already supported as "arch13" in LLVM. This patch adds support for "z15" as an alternate architecture name for arch13. The patch also uses z15 in a number of places where we used arch13 as long as the official name was not yet announced. llvm-svn: 372435	2019-09-20 23:04:45 +00:00
Craig Topper	5465875e93	[X86] Add support for avx512bf16 for __builtin_cpu_supports and compiler-rt's cpu indicator. llvm-svn: 370915	2019-09-04 16:01:43 +00:00
Craig Topper	5a43fdd313	[X86] Remove what little support we had for MPX -Deprecate -mmpx and -mno-mpx command line options -Remove CPUID detection of mpx for -march=native -Remove MPX from all CPUs -Remove MPX preprocessor define I've left the "mpx" string in the backend so we don't fail on old IR, but its not connected to anything. gcc has also deprecated these command line options. https://www.phoronix.com/scan.php?page=news_item&px=GCC-Patch-To-Drop-MPX Differential Revision: https://reviews.llvm.org/D66669 llvm-svn: 370393	2019-08-29 18:09:02 +00:00
Pengfei Wang	e28cbbd5d4	[X86] Support -march=tigerlake Support -march=tigerlake for x86. Compare with Icelake Client, It include 4 more new features ,they are avx512vp2intersect, movdiri, movdir64b, shstk. Patch by Xiang Zhang (xiangzhangllvm) Differential Revision: https://reviews.llvm.org/D65840 llvm-svn: 368543	2019-08-12 01:29:46 +00:00
Eric Christopher	1d73e228db	BMI2 support is indicated in bit eight of EBX, not nine. See Intel SDM, Vol 2A, Table 3-8: https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-2a-manual.pdf#page=296 Differential Revision: https://reviews.llvm.org/D65766 llvm-svn: 367929	2019-08-05 21:25:59 +00:00
Ulrich Weigand	0f0a8b7784	[SystemZ] Add support for new cpu architecture - arch13 This patch series adds support for the next-generation arch13 CPU architecture to the SystemZ backend. This includes: - Basic support for the new processor and its features. - Assembler/disassembler support for new instructions. - CodeGen for new instructions, including new LLVM intrinsics. - Scheduler description for the new processor. - Detection of arch13 as host processor. Note: No currently available Z system supports the arch13 architecture. Once new systems become available, the official system name will be added as supported -march name. llvm-svn: 365932	2019-07-12 18:13:16 +00:00
Yi Kong	432f48fcd4	[AArch64] Add more CPUs to host detection Returns "cortex-a73" for 3rd and 4th gen Kryo; not precisely correct, but close enough. Differential Revision: https://reviews.llvm.org/D63099 llvm-svn: 363013	2019-06-11 00:05:36 +00:00
Pengfei Wang	f8b28931a7	[X86] -march=cooperlake (llvm) Support intel -march=cooperlake in llvm Patch by Shengchen Kan (skan) Differential Revision: https://reviews.llvm.org/D62836 llvm-svn: 362776	2019-06-07 08:31:35 +00:00
Craig Topper	c669629e6c	[X86] Resync Host.cpp with compiler-rt's cpu_model.c to enable 0x55 to be identified as cascadelake when avx512vnni is detected. Some other formatting changes. llvm-svn: 362256	2019-05-31 19:18:07 +00:00
Pengfei Wang	1f67d94279	[X86] Add ENQCMD instructions For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference. Patch by Tianqing Wang (tianqing) Differential Revision: https://reviews.llvm.org/D62281 llvm-svn: 362053	2019-05-30 03:59:16 +00:00
Craig Topper	2f1895e03d	[X86] Add more icelake model numbers to getHostCPUName. Using model numbers found in Table 2-1 of the May 2019 version of the Intel Software Developer's Manual Volume 4. llvm-svn: 361422	2019-05-22 19:51:35 +00:00
Craig Topper	cac6b76a76	[X86] Add icelake-client and tremont model numbers to getHostCPUName. llvm-svn: 361174	2019-05-20 16:58:23 +00:00
Luo, Yuanke	beec41c656	Enable AVX512_BF16 instructions, which are supported for BFLOAT16 in Cooper Lake Summary: 1. Enable infrastructure of AVX512_BF16, which is supported for BFLOAT16 in Cooper Lake; 2. Enable VCVTNE2PS2BF16, VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural Network Instructions supporting BFLOAT16 inputs and conversion instructions from IEEE single precision. VCVTNE2PS2BF16: Convert Two Packed Single Data to One Packed BF16 Data. VCVTNEPS2BF16: Convert Packed Single Data to Packed BF16 Data. VDPBF16PS: Dot Product of BF16 Pairs Accumulated into Packed Single Precision. For more details about BF16 isa, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Author: LiuTianle Reviewers: craig.topper, smaslov, LuoYuanke, wxiao3, annita.zhang, RKSimon, spatel Reviewed By: craig.topper Subscribers: kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60550 llvm-svn: 360017	2019-05-06 08:22:37 +00:00
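
Commit 0ce9bf6eed in the table above describes picking the right 32-bit feature variable for a bit taken from a flat 96-bit enum. A minimal sketch of that idea follows, written in Python for brevity. The function names, the list layout, and the example bit numbers are hypothetical and are not taken from LLVM's Host.cpp.

```python
# Illustrative sketch only: has_feature/set_feature and the bit numbers
# below are hypothetical and do not mirror LLVM's actual enum or helper.
WORD_BITS = 32


def set_feature(words, bit):
    """Set flat feature index `bit` in a list of 32-bit words."""
    word, offset = divmod(bit, WORD_BITS)
    words[word] |= 1 << offset


def has_feature(words, bit):
    """Return True if flat feature index `bit` is set in `words`."""
    word, offset = divmod(bit, WORD_BITS)
    return bool((words[word] >> offset) & 1)


# Three 32-bit words give a flat 96-bit feature space.
features = [0, 0, 0]
set_feature(features, 5)   # lands in word 0, bit 5
set_feature(features, 70)  # lands in word 2, bit 6
```

Deriving the word index and the in-word offset from the flat bit number in one place removes the manual "pick a variable and subtract an offset" step that the commit message calls error prone.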

262 Commits