Instead, memory is allocated and deallocated using mmap and munmap
syscalls directly.
Reviewed By: lntue, michaelrj
Differential Revision: https://reviews.llvm.org/D122876
This patch adds the headers for printf. It contains minimal actual code,
and is more intended to be used for design review. The code is not built
yet, and may have minor errors.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D122773
In the previous patch adding -mfma to functions that need it for windows
builds I missed log2f.
Differential Revision: https://reviews.llvm.org/D122693
On Windows the functions that use fma don't properly include the fma
intrinsics unless -mfma is added to the compile options. This patch adds
the compile option to all of the functions that need it.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D122689
Previously, the "fsqrt" instruction was used on all x86_64 platforms
for finding the square root of long doubles. On long double is double
platforms (e.g. windows) this created errors. This patch changes square
root function for long doubles to be the same as the one for doubles if
long doubles are doubles.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D122688
The clang driver to picks up compiler runtime files using full paths.
Without this, at least for aarch64, the driver tries to pick up the
compiler runtime files from the working directory.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D122617
Reduce the polynomial's degree from 7 down to 4.
Currently we use a degree-7 minimax polynomial on an interval of length 2^-7
around 0 to compute `expf`. Based on the suggestion of @santoshn and the RLIBM
project (https://github.com/rutgers-apl/rlibm-all/blob/main/source/float/exp.c)
and the improvement we made with `exp2f` in https://reviews.llvm.org/D122346,
it is possible to have a good polynomial of degree-4 on a subinterval of length
2^(-7) to approximate e^x.
We did try to either reduce the degree of the polynomial down to 3 or increase
the interval size to 2^(-6), but in both cases the number of exceptional values
exploded. So we settle with using a degree-4 polynomial of the interval of
size 2^(-7) around 0.
Reviewed By: sivachandra, zimmermann6, santoshn
Differential Revision: https://reviews.llvm.org/D122418
Reduce the range-reduction table size from 128 entries down to 64 entries, and
reduce the polynomial's degree from 6 down to 4.
Currently we use a degree-6 minimax polynomial on an interval of length 2^-7
around 0 to compute exp2f. Based on the suggestion of @santoshn and the RLIBM
project (https://github.com/rutgers-apl/rlibm-prog/blob/main/libm/float/exp2.c)
it is possible to have a good polynomial of degree-4 on a subinterval of length
2^(-6) to approximate 2^x.
We did try to either reduce the degree of the polynomial down to 3 or increase
the interval size to 2^(-5), but in both cases the number of exceptional values
exploded. So we settle with using a degree-4 polynomial of the interval of
size 2^(-6) around 0.
Reviewed By: michaelrj, sivachandra, zimmermann6, santoshn
Differential Revision: https://reviews.llvm.org/D122346
The main code for the FILE struct is only enabled on platforms that it
works on, but before this patch the tests were included unconditionally.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D122363
Previously, we used empty, non-ELF crti.o, crtn.o, libm.a and libc++.a
files. Instead, we now still use dummies but they are real ELF object
files and archives.
This patch adds aligned_alloc as an entrypoint. Previously it was being
included implicitly.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D122362
We were previously linking to libllvmlibc.a. But, with libllvmlibc.a now
including functions which depend on the loader, we will have to use the
LLVM libc loader as well. To avoid this, we will link to a special
library which is just a collection of SCUDO allocator entrypoints.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D122360
This patch primarily fixes the fenv implementation on Windows, since
Windows uses the MXCSR in place of the x87 status registers for storing
information about the floating point environment. This allows FEnv to
work correctly on Windows, and successfully build.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D121839
All existing loader tests are switched to an integration test added with
the new rule. Also, the getenv test is now enabled as an integration test.
All loader tests have been moved to test/integration. Also, the simple
checker library for the previous loader tests has been moved to a
separate directory of its own.
A follow up change will perform more cleanup of the loader CMake rules
to eliminate now redundent options.
Reviewed By: lntue, michaelrj
Differential Revision: https://reviews.llvm.org/D122266
An unnecessary dep of the getenv function is removed. From the x86_64
loader, a call to __llvm_libc::memcpy is replaced with call to
__llvm_libc::inline_memcpy.
The platform independent file implementation is not an entrypoint so it
cannot be excluded via the entrypoints.txt file. Hence, we need a
special treatment to exclude it from the build.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D121947
Having a separate flag helps in setting up proper flags when
implementing, say the Linux specialization of File.
Along the way, a signature for a function which is to be used to open
files has been added. The implementation of the function is to be
included in platform specializations.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D121889
This is now possible because we have a platform independent abstraction
for mutexes.
Reviewed By: lntue, michaelrj
Differential Revision: https://reviews.llvm.org/D121773
Implement expm1f function that is correctly rounded for all rounding modes. This is based on expf implementation.
From exhaustive testings, using expf implementation, and subtract 1.0 before rounding the final result to single precision
gives correctly rounded results for all |x| > 2^-4 with 1 exception. When |x| < 2^-25, we use x + x^2 (implemented with a
single fma). And for 2^-25 <= |x| <= 2^-4, we use a single degree-8 minimax polynomial generated by Sollya.
Reviewed By: sivachandra, zimmermann6
Differential Revision: https://reviews.llvm.org/D121574
Implement exp2f function that is correctly rounded for all rounding modes.
Reviewed By: sivachandra, zimmermann6
Differential Revision: https://reviews.llvm.org/D121463
Implement expf function that is correctly rounded for all rounding modes.
Reviewed By: sivachandra, zimmermann6
Differential Revision: https://reviews.llvm.org/D121440
The new container is used to store atexit callbacks. This way, we avoid
the possibility of the destructor of the container itself getting added
as an at exit callback.
Reviewed By: abrachet
Differential Revision: https://reviews.llvm.org/D121350
Add initial support for darwin-aarch64 (macOS M1).
Some differences compared to linux-aarch64:
- `math.h` defined `math_errhandling` by the compiler builtin `__math_errhandling()` but Apple Clang 13.0.0 on M1 does not support `__math_errhandling()` builtin as a macro function or a constexpr function.
- `math.h` defines `UNDERFLOW` and `OVERFLOW` macros.
- Besides 5 usual floating point exceptions: `FE_INEXACT`, `FE_UNDERFLOW`, `FE_OVERFLOW`, `FE_DIVBYZERO`, and `FE_INVALID`, `fenv.h` also has another floating point exception: `FE_FLUSHTOZERO`. The corresponding trap for `FE_FLUSHTOZERO` in the control register is at the different location compared to the status register.
- `FE_FLUSHTOZERO` exception flag cannot be raised with the default CPU floating point operation mode.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D120914
There were some "TODO" messages that were for things that I have already
completed. This patch removes those.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D121232
Previously, the entire support/CPP folder was in one header library,
which meant that a lot of headers were included where they shouldn't be.
This patch splits each header into its own target, as well as adjusting
each place they were included to only include what is used.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D121237
The loader TLS test for x86_64, which now passes, has been enabled.
A future change should enable the test for aarch64 as well.
Reviewed By: jeffbailey
Differential Revision: https://reviews.llvm.org/D121091