Before, clang's internal assembler would reject the inline asm in clang's
Intrin.h. To make sure this doesn't happen for other Intrin.h functions using
__asm__ blocks, add 32-bit and 64-bit codegen tests for Intrin.h.
Sadly, these tests discovered that __readcr3 and __writecr3 have bad
implementations in 64-bit builds. This will have to be fixed in a follow-up.
llvm-svn: 248234
As discussed in PR23648 - the intrinsics _m_from_int, _m_to_int and _m_prefetch are defined in mmintrin.h and prfchwintrin.h so we don't need to in Intrin.h
Added tests for _m_from_int and _m_to_int
D11338 already added a test for _m_prefetch
Differential Revision: http://reviews.llvm.org/D12272
llvm-svn: 245975
_rotl, _rotwl and _lrotl (and their right-shift counterparts) are official x86
intrinsics, and should be supported regardless of environment. This is in contrast
to _rotl8, _rotl16, and _rotl64 which are MS-specific.
Note that the MS documentation for _lrotl is different from the Intel
documentation. Intel explicitly documents it as a 64-bit rotate, while for MS,
since sizeof(unsigned long) for MSVC is always 4, a 32-bit rotate is implied.
Differential Revision: http://reviews.llvm.org/D12271
llvm-svn: 245923
_ReadBarrier, _WriteBarrier, and _ReadWriteBarrier are essentially
memory barriers of one form or another. Model these as
atomic_signal_fence(ATOMIC_SEQ_CST).
__faststorefence is a curious intrinsic. It's single purpose seems to
an alternative to mfence when that instruction is slow. However, mfence
is not always slow and is, in general, preferable to a 'lock or'
sequence on certain CPUs. Give the compiler freedom to select the best
sequence to get a fence.
llvm-svn: 242378
Three things:
- The atomic intrinsics mandate memory barriers, let's start emitting
some.
- We don't need to manually create RMW operations, we can just do
__atomic_fetch_foo instead of performing __atomic_foo_fetch and
undoing foo.
- Don't use inline assembly, we don't need it for these intrinsics.
This fixes PR24101.
llvm-svn: 242009
Add intrinsics for the FXSR instructions (FXSAVE/FXSAVE64/FXRSTOR/FXRSTOR64)
These were previously declared in Intrin.h for MSVC compatibility, but now
that we have them implemented, these declarations can be removed.
llvm-svn: 241053
We would crash in the DeclPrinter trying to pretty-print the
static_assert message. C++1z-style assertions don't have a message so
we would crash.
This fixes PR23756.
llvm-svn: 239170
Implement _umul128; it provides the high and low halves of a 128-bit
multiply. We can simply use our __int128 arithmetic to implement this,
we generate great code for it:
movq %rdx, %rax
mulq %rcx
movq %rdx, (%r8)
retq
Differential Revision: http://reviews.llvm.org/D6486
llvm-svn: 223175
The Windows NT SDK uses __readfsdword and declares it as a compiler provided
builtin (#pragma intrinsic(__readfsword). Because intrin.h is not referenced
by winnt.h, it is not possible to provide an out-of-line definition for the
intrinsic. Provide a proper compiler builtin definition.
llvm-svn: 220859
Also provide _setjmpex(). r200243 put in _setjmp() and _setjmpex() behind a
comment since jmp_buf wasn't available. r200344 added jmp_buf and put in
_setjmp(), but missed _setjmpex().
llvm-svn: 212557
Protect MMX specific declarations under a __MMX__ guard. This header can be
included on non-x86 architectures (e.g. ARM) which do not support the MMX ISA.
Use the preprocessor to prevent these declarations from being processed.
llvm-svn: 212512
Conditionally include x86intrin.h if we are building for x86 or x86_64.
Conditionalise definition of inline assembly routines which use x86 or x86_64
inline assembly. This is needed as clang can target Windows on ARM where these
definitions may be included into user code.
llvm-svn: 211716
Add support for _InterlockedCompareExchangePointer, _InterlockExchangePointer,
_InterlockExchange. These are available as a compiler intrinsic on ARM and x86.
These are used directly by the Windows SDK headers without use of the intrin
header.
llvm-svn: 211216
Don't include input and output regs in clobbers. Prefix some
identifiers with __. Add a memory constraint to __readcr3 to prevent
reordering. This constraint is heavy handed, but conservatively
correct.
Thanks to PaX Team for the suggestions.
llvm-svn: 205778
They're already defined in ia32intrin.h, and this would cause including Intrin.h
in 64-bit mode to fail because of conflicting types. Update ms-intrin.cpp to
also run in 64-bit mode to catch things like this.
llvm-svn: 203714
Because GCC incorrectly defines _mm_prefetch to take anything that casts
to void*, people have started using that behavior. The previous patch
that made _mm_prefetch actually take a const char * broke compatibility
with existing code. This update to the patch leaves the macro that
defines _mm_prefetch with the (void*) cast when _MSC_VER is not defined.
llvm-svn: 201901
This breaks backwards compatibility with existing code. Previously, this
was defined as
#define _mm_prefetch(a, sel) (__builtin_prefetch((void *)(a), 0, (sel)))
Which basically accepts any pointer. Changing this to char* simply
breaks a lot of existing code. I have tried changing char* to
"const void*", which seems to be the right thing as per Intel
specification this should work on basically any pointer. However,
apparently this breaks windows compatibility (because of a conflicting
declaration in windows.h).
So, we probably need to #ifdef this based on whether clang is compiling
for windows. According to Chandler, this might be done by introducing an
additional symbol to a fake type in BuiltinsX86.def and then condition
the type expansion on the platform.
llvm-svn: 201775
This patch adds several built-ins that are required for ms
compatibility. _mm_prefetch must be a built-in because it takes a
compile-time constant argument and our prior approach of using a #define
to the current built-in doesn't work in the presence of re-declaration
of _mm_prefetch. The others can be obtained by including the windows
system headers. If a user includes the windows system headers but not
intrin.h they still need to work and therefore must be built-in because
we don't get a chance to implement them in intrin.h in this case.
llvm-svn: 201734
The two identical implementations of __cpuid for X86 / X86_64 were
leftovers from my first iteration on the patch that implemented it.
llvm-svn: 200568
This failed the ms-intrin.cpp test.
This reverts commit r200237.
This also comments out the _setjmpex declaration for now so that
intrin.h will work on x64 targets.
llvm-svn: 200243