These lit tests now requires amdgpu-registered-target since they
use clang driver and clang driver passes an LLVM option which
is available only if amdgpu target is registered.
Change-Id: I2df31967409f1627fc6d342d1ab5cc8aa17c9c0c
This patch mainly made the following changes:
1. Support AVX-VNNI instructions;
2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpbusds/vpdpbusds instructions only use vex-encoding when user explicity add {vex} prefix.
Differential Revision: https://reviews.llvm.org/D89105
GCC 11 will define this macro.
In LLVM, the feature flag only applies to 64-bit mode and we always define the
macro in 32-bit mode. This is different from GCC -m32 in which -mno-sahf can
suppress the macro. The discrepancy can unlikely cause trouble.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D89198
As reported in Bug 42535, `clang` doesn't inline atomic ops on 32-bit
Sparc, unlike `gcc` on Solaris. In a 1-stage build with `gcc`, only two
testcases are affected (currently `XFAIL`ed), while in a 2-stage build more
than 100 tests `FAIL` due to this issue.
The reason for this `gcc`/`clang` difference is that `gcc` on 32-bit
Solaris/SPARC defaults to `-mpcu=v9` where atomic ops are supported, unlike
with `clang`'s default of `-mcpu=v8`. This patch changes `clang` to use
`-mcpu=v9` on 32-bit Solaris/SPARC, too.
Doing so uncovered two bugs:
`clang -m32 -mcpu=v9` chokes with any Solaris system headers included:
/usr/include/sys/isa_defs.h:461:2: error: "Both _ILP32 and _LP64 are defined"
#error "Both _ILP32 and _LP64 are defined"
While `clang` currently defines `__sparcv9` in a 32-bit `-mcpu=v9`
compilation, neither `gcc` nor Studio `cc` do. In fact, the Studio 12.6
`cc(1)` man page clearly states:
These predefinitions are valid in all modes:
[...]
__sparcv8 (SPARC)
__sparcv9 (SPARC -m64)
At the same time, the patch defines `__GCC_HAVE_SYNC_COMPARE_AND_SWAP_[1248]`
for a 32-bit Sparc compilation with any V9 cpu. I've also changed
`MaxAtomicInlineWidth` for V9, matching what `gcc` does and the Oracle
Developer Studio 12.6: C User's Guide documents (Ch. 3, Support for Atomic
Types, 3.1 Size and Alignment of Atomic C Types).
The two testcases that had been `XFAIL`ed for Bug 42535 are un-`XFAIL`ed
again.
Tested on `sparcv9-sun-solaris2.11` and `amd64-pc-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D86621
Support -march=sapphirerapids for x86.
Compare with Icelake Server, it includes 14 more new features. They are
amxtile, amxint8, amxbf16, avx512bf16, avx512vp2intersect, cldemote,
enqcmd, movdir64b, movdiri, ptwrite, serialize, shstk, tsxldtrk, waitpkg.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D86503
These features exist in earlier CPUs, but were deprecated on
znver1/znver2. While working on D82731 I accidentally copied
them from the earlier CPU. And nothing caught my mistake. Having
these additional checks would have helped.
The PREFETCHW instruction was originally part of the 3DNow. But
it was given its own CPUID bit on later CPUs just before 3DNow
was deprecated.
We were setting the -mprfchw flag if -m3dnow was passed or the CPU
supported 3dnow unless -mno-prfchw was passed. But -march=native
on a CPU without the PRFCHW CPUID bit set will pass -mno-prfchw.
So -march=k8 will behave differently than -march=native on a K8
for example.
So remove this implicit setting from the frontend and instead
enable the backend to use PREFETCHW if 3dnow OR prfchw is enabled.
Also enable PRFCHW flag on amdfam10/barcelona which seems to be
where this CPUID bit was introduced. That CPU also supported
3dnow.
The PREFETCHW instruction was originally part of the 3DNow. But
it was given its own CPUID bit on later CPUs just before 3DNow
was deprecated.
We were setting the -mprfchw flag if -m3dnow was passed or the CPU
supported 3dnow unless -mno-prfchw was passed. But -march=native
on a CPU without the PRFCHW CPUID bit set will pass -mno-prfchw.
So -march=k8 will behave differently than -march=native on a K8
for example.
So remove this implicit setting from the frontend and instead
enable the backend to use PREFETCHW if 3dnow OR prfchw is enabled.
Also enable PRFCHW flag on amdfam10/barcelona which seems to be
where this CPUID bit was introduced. That CPU also supported
3dnow.
The recently announced IBM z15 processor implements the architecture
already supported as "arch13" in LLVM. This patch adds support for
"z15" as an alternate architecture name for arch13.
Corrsponding LLVM support was committed as rev. 372435.
llvm-svn: 372436
-Deprecate -mmpx and -mno-mpx command line options
-Remove CPUID detection of mpx for -march=native
-Remove MPX from all CPUs
-Remove MPX preprocessor define
I've left the "mpx" string in the backend so we don't fail on old IR, but its not connected to anything.
gcc has also deprecated these command line options. https://www.phoronix.com/scan.php?page=news_item&px=GCC-Patch-To-Drop-MPX
Differential Revision: https://reviews.llvm.org/D66669
llvm-svn: 370393
Support -march=tigerlake for x86.
Compare with Icelake Client, It include 4 more new features ,they are
avx512vp2intersect, movdiri, movdir64b, shstk.
Patch by Xiang Zhang (xiangzhangllvm)
Differential Revision: https://reviews.llvm.org/D65840
llvm-svn: 368543
This patch series adds support for the next-generation arch13
CPU architecture to the SystemZ backend.
This includes:
- Basic support for the new processor and its features.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining __VEC__ == 10303.
Note: No currently available Z system supports the arch13
architecture. Once new systems become available, the
official system name will be added as supported -march name.
llvm-svn: 365933
This patch enables the following
1) AMD family 17h "znver2" tune flag (-march, -mcpu).
2) ISAs that are enabled for "znver2" architecture.
3) For the time being, it uses the znver1 scheduler model.
4) Tests are updated.
5) This patch is the clang counterpart to D58343
Reviewers: craig.topper
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58344
llvm-svn: 354899
This is skylake-avx512 with the addition of avx512vnni ISA.
Patch by Jianping Chen
Differential Revision: https://reviews.llvm.org/D54792
llvm-svn: 347682
These intrinsics exist in icc. They can be found on the Intel Intrinsics Guide website.
All the backend support is in place to pattern match a load+bswap or a bswap+store pattern to the MOVBE instructions. So we just need to get the frontend to emit the correct IR. The pointer arguments in icc are declared as void so I had to jump through a packed struct to forcing a specific alignment on the load/store. Same trick we use in the unaligned vector load/store intrinsics
Differential Revision: https://reviews.llvm.org/D52586
llvm-svn: 343343
An intrinsic for an old instruction, as described in the Intel SDM.
Reviewers: craig.topper, rnk
Reviewed By: craig.topper, rnk
Differential Revision: https://reviews.llvm.org/D47142
llvm-svn: 333256
The WBNOINVD instruction writes back all modified
cache lines in the processor’s internal cache to main memory
but does not invalidate (flush) the internal caches.
Reviewers: craig.topper, zvi, ashlykov
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D43817
llvm-svn: 329848
Cannon Lake does not support CLWB, therefore it
does not include all features listed under SKX.
Patch by Gabor Buella
Differential Revision: https://reviews.llvm.org/D43459
llvm-svn: 325655
I based that commit on what was in Intel's public documentation here https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
Which specifically said CLWB wasn't until Icelake.
But I've since cross checked with SDE and it thinks these features exist on CNL and ICL. So now I don't know what to believe.
I've added test coverage of the current behavior as part of the revert so at least now have proof of what we're doing.
llvm-svn: 321547
We have cannonlake and icelake inheriting from skylake server in a switch using fallthroughs. But they aren't perfect supersets of skylake server.
llvm-svn: 321504