This is modeled after the half-precision fp support. Two new nodes are
introduced for casting from and to bf16. Since casting from bf16 is a
simple operation I opted to always directly lower it to integer
arithmetic. The other way round is more complicated if you want to
preserve IEEE semantics, so it's handled by a new __truncsfbf2
compiler-rt builtin.
This is of course very bare bones, but sufficient to get a semi-softened
fadd on x86.
Possible future improvements:
- Targets with bf16 conversion instructions can now make fp_to_bf16 legal
- The software conversion to bf16 can be replaced by a trivial
implementation under fast math.
Differential Revision: https://reviews.llvm.org/D126953
ARM EHABI isn't signalled by any specific compiler builtin define,
but is implied by the lack of defines specifying any other
exception handling mechanism, `__USING_SJLJ_EXCEPTIONS__` or
`__ARM_DWARF_EH__`.
As Windows SEH also can be used for unwinding, check for the
`__SEH__` define too, in the same way.
This is the same change as 4a3722a2c3 /
D126866, applied on the compiler-rt builtins gcc_personality_v0
function.
Differential Revision: https://reviews.llvm.org/D126863
Don't build atomic fetch nand libcall functions when the required
compiler builtin isn't available. Without this compiler-rt can't be
built with LLVM 13 or earlier.
Not building the libcall functions isn't optimal, but aligns with the
usecase in FreeBSD where compiler-rt from LLVM 14 is built with an LLVM
13 clang and no LLVM 14 clang is built.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D126710
GCC recently started setting constructor priority on init_have_lse_atomics [1]
to avoid undefined initialization order with respect to other initializers,
causing accidental use of ll/sc intrinsics on targets where this was not
intended (which presents a minor performance problem as well as a
compatibility problem for users wanting to use the rr debugger). I initially
thought compiler-rt does not have the same issue as libgcc, since it looks
like we're already setting init priority on the constructor.
Unfortuantely, it does not appear that the HAVE_INIT_PRIORITY check is ever
performed anyway, so despite appearances the init priority was not actually
applied. Fix that by applying the init priority unconditionally. It has been
supported in clang ever since it was first introduced and in any case for
more than 14 years in both gcc and clang. MSVC is already excluded from this
code path and we're already using constructors with init priority elsewhere
in compiler-rt without additional check (though mostly in the sanitizer
runtime, which may have more narrow target support). Regardless, I believe
that for our supported compilers, if they support the constructor attribute,
they should also support init priorities.
While we're here, change the init priority from 101, which is the highest
priority for end user applications, to instead use one of the priority levels
reserved for implementations (1-100; lower integers are higher priority).
GCC ended up using `90`, so this commit aligns the value in compiler-rt
to the same value to ensure that there are no subtle initialization order
differences between libgcc and compiler-rt.
[1] 75c4e4909a
Differential Revision: https://reviews.llvm.org/D126424
The spinlock requires that lock-free operations are available;
otherwise, the implementation just calls itself. As discussed in
D120026.
Differential Revision: https://reviews.llvm.org/D123080
D123200 did not include the generic sources, which means that only the
AVR-specific sources were compiled. With this change, generic sources
are included as expected.
Tested with the following commands:
cmake -G Ninja -DCOMPILER_RT_DEFAULT_TARGET_TRIPLE=avr -DCOMPILER_RT_BAREMETAL_BUILD=1 -DCMAKE_C_COMPILER=clang-14 -DCMAKE_C_FLAGS="--target=avr -mmcu=avr5 -nostdlibinc -mdouble=64" ../path/to/builtins
ninja
Differential Revision: https://reviews.llvm.org/D124969
Previously the default was long, which is 32-bit on AVR. But avr-gcc
expects a smaller value: it reads the return value from r24.
This is actually a regression from https://reviews.llvm.org/D98205.
Before D98205, the return value was an enum (which was 2 bytes in size)
which was compatible with the 1-byte return value that avr-gcc was
expecting. But long is 4 bytes and thus places the significant return
value in a different register.
Differential Revision: https://reviews.llvm.org/D124939
Apply this in add_compiler_rt_runtime instead of manually adding it
to the individual projects. This applies the option on more
parts of compiler-rt than before, but should ideally not make any
difference assuming the other runtimes that lacked the option
also were C11 compatible.
Not marking this as required, to match the existing behaviour (where
`-std=c11` was added only if supported by the compiler).
This was suggested during the review of D110005.
Differential Revision: https://reviews.llvm.org/D124343
Looks like when the VE support was added it was added a few lines below where it should have been.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D123439
According to the RFC [0], this review contains the compiler-rt parts of large integer divison for _BitInt.
It adds the functions
```
/// Computes the unsigned division of a / b for two large integers
/// composed of n significant words.
/// Writes the quotient to quo and the remainder to rem.
///
/// \param quo The quotient represented by n words. Must be non-null.
/// \param rem The remainder represented by n words. Must be non-null.
/// \param a The dividend represented by n + 1 words. Must be non-null.
/// \param b The divisor represented by n words. Must be non-null.
/// \note The word order is in host endianness.
/// \note Might modify a and b.
/// \note The storage of 'a' needs to hold n + 1 elements because some
/// implementations need extra scratch space in the most significant word.
/// The value of that word is ignored.
COMPILER_RT_ABI void __udivmodei5(su_int *quo, su_int *rem, su_int *a,
su_int *b, unsigned int n);
/// Computes the signed division of a / b.
/// See __udivmodei5 for details.
COMPILER_RT_ABI void __divmodei5(su_int *quo, su_int *rem, su_int *a, su_int *b,
unsigned int words);
```
into builtins.
In addition it introduces a new "bitint" library containing only those new functions,
which is meant as a way to provide those when using libgcc as runtime.
[0] https://discourse.llvm.org/t/rfc-add-support-for-division-of-large-bitint-builtins-selectiondag-globalisel-clang/60329
Differential Revision: https://reviews.llvm.org/D120327
Compiler-rt cross-compile for ARMv5 fails because D99282 made it an error if DMB
is used for any pre-ARMv6 targets. More specifically, the "#error only supported
on ARMv6+" added in D99282 will cause compilation to fail when any source file
which includes assembly.h are compiled for pre-ARMv6 targets. Since the only
place where DMB is used is syn-ops.h (which is only included by
arm/sync_fetch_and_* and these files are excluded from being built for older
targets), this patch moves the definition there to avoid the issues described
above.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D123105
dd9173420f (Add clear_cache implementation for ppc64. Fix buffer to
meet ppc64 alignment., 2017-07-28), adds an implementation for
__builtin___clear_cache on powerpc64, which was promptly ammended to
also be used with big endian mode in f67036b62c (This ppc64 implementation
of clear_cache works for both big and little endian., 2017-08-02)
clang will use this implementation for it's builtin on FreeBSD and result
in an abort() in the cases where 32-bit generation was requested (ex in
macppc or when the big endian powerpc64 build was done with "-m32") and as
reported[1] recently with pcre2, but there is no reason why the same code
couldn't be used in those cases, so use instead the more generic identifier
for the PowerPC architecture.
While at it, update the comment to reflect that POWER8/9 have a 128 byte
wide cache line and so the code could instead use 64 byte windows instead
but that possible optimization has been punted for now.
[1] https://github.com/PhilipHazel/pcre2/issues/92
Reviewed By: jhibbits, #powerpc, MaskRay
Differential Revision: https://reviews.llvm.org/D122640
Use Fuchsia's zx_system_get_features API to determine
whether LSE atomics are available on the machine.
Reviewed By: abrachet
Differential Revision: https://reviews.llvm.org/D118839
.thumb_func was not switching mode until [1]
so it did not show up but now that .thumb_func (without argument) is
switching mode, its causing build failures on armv6 ( rpi0 ) even when
build is explicitly asking for this file to be built with -marm (ARM
mode), therefore use DEFINE_COMPILERRT_FUNCTION macro to add function
header which considers arch and mode from compiler cmdline to decide if
the function is built using thumb mode or arm mode.
[1] https://reviews.llvm.org/D101975
Note that it also needs https://reviews.llvm.org/D99282
Reviewed By: peter.smith, MaskRay
Differential Revision: https://reviews.llvm.org/D104183
At present compiler-rt cross compiles for armv6 ( -march=armv6 ) but includes
dmb instructions which are only available in armv7+ this causes SIGILL on
clang+compiler-rt compiled components on rpi0w platforms.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D99282
Let `archive-aix-libatomic` accept additional argument to customize name of output atomic library.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D120534
This reverts commit 910a642c0a.
There are serious correctness issues with the current approach: __sync_*
routines which are not actually atomic should not be enabled by default.
I'll continue discussion on the review.
ARMv5 and older architectures don’t support SMP and do not have atomic instructions. Still they’re in use in IoT world, where one has to stick to libgcc.
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D116088
`__builtin_c[tl]z` accepts `unsigned int` argument that is not always the
same as uint32_t. For example, `unsigned int` is uint16_t on MSP430.
Reviewed By: aykevl
Differential Revision: https://reviews.llvm.org/D86547
Use s[iu]_int instead of `(unsigned) int` and d[ui]_int instead of
`(unsigned) long long` for LibCall arguments.
Note: the `*vfp` LibCall versions were NOT touched.
Reviewed By: aykevl
Differential Revision: https://reviews.llvm.org/D86546
indirect branch tracking(IBT) feature aiming to ensure the target address
of an indirect jump/call is not tampered.
When IBT is enabled, each function or target of any indirect jump/call will start
with an 'endbr32/64' instruction otherwise the program will crash during execution.
To build an application with CET enabled. we need to ensure:
1. build the source code with "-fcf-protection=full"
2. all the libraries linked with .o files must be CET enabled too
This patch aims to enable CET for compiler-rt builtins library, we add an option
"COMPILER_RT_ENABLE_CET" whose default value is OFF to enable CET for compiler-rt
in building time and when this option is "ON", "-fcf-protection=full" is added to
BUILTINS_CFLAG and the "endbr32/64" will be placed in the beginning of each assembly
function. We also enabled CET for crtbegin, crtend object files in this patch.
Reviewed by: MaskRay, compnerd, manojgupta, efriedma
Differential Revision: https://reviews.llvm.org/D109811
Signed-off-by: jinge90 <ge.jin@intel.com>
This allows their reuse across projects. The name of the module
is intentionally generic because we would like to move more platform
checks there.
Differential Revision: https://reviews.llvm.org/D115276
In D116472 we created conditionally defined variables for the tools to
unbreak the legacy build where they are in `llvm/tools`.
The runtimes are not tools, so that flexibility doesn't matter. Still,
it might be nice to define (unconditionally) and use the variable for
the runtimes simply to make the code a bit clearer and document what is
going on.
Also, consistently put project dirs at the beginning, not end of `CMAKE_MODULE_PATH`. This ensures they will properly shadow similarly named stuff that happens to be later on the path.
Reviewed By: mstorsjo, #libunwind, #libc, #libc_abi, ldionne
Differential Revision: https://reviews.llvm.org/D116477
Using the out-of-line LSE atomics helpers for AArch64 on FreeBSD also
requires adding support for initializing __aarch64_have_lse_atomics
correctly. On Linux this is done with getauxval(3), on FreeBSD with
elf_aux_info(3), which has a slightly different interface.
Differential Revision: https://reviews.llvm.org/D109330
Add `__c11_atomic_fetch_nand` builtin to language extensions and support `__atomic_fetch_nand` libcall in compiler-rt.
Reviewed By: theraven
Differential Revision: https://reviews.llvm.org/D112400
There's a lot of duplicated calls to find various compiler-rt libraries
from build of runtime libraries like libunwind, libc++, libc++abi and
compiler-rt. The compiler-rt helper module already implemented caching
for results avoid repeated Clang invocations.
This change moves the compiler-rt implementation into a shared location
and reuses it from other runtimes to reduce duplication and speed up
the build.
Differential Revision: https://reviews.llvm.org/D88458
There's a lot of duplicated calls to find various compiler-rt libraries
from build of runtime libraries like libunwind, libc++, libc++abi and
compiler-rt. The compiler-rt helper module already implemented caching
for results avoid repeated Clang invocations.
This change moves the compiler-rt implementation into a shared location
and reuses it from other runtimes to reduce duplication and speed up
the build.
Differential Revision: https://reviews.llvm.org/D88458