Our syscall implementation depends on a specific macro that's only
defined in our headers. If we're not using our headers, then the test
doesn't work. I've disabled the test in this case because there's no
point in testing the system libc's syscall implementation.
Differential Revision: https://reviews.llvm.org/D134994
Add the syscall wrapper function and tests. It's implemented using a
macro to guarantee the minimum number of arguments.
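A minimal sketch of how a macro can enforce a minimum argument count follows. The names `demo_syscall`, `DEMO_SYSCALL_PICK` and `demo_syscall_impl` are hypothetical and not taken from this patch, and the backend only encodes its arguments into a number instead of trapping into the kernel, so the example stays portable.

```cpp
// Hypothetical fixed-arity backend (an assumption for illustration only).
// It encodes its arguments so a caller can see exactly what was passed.
inline long demo_syscall_impl(long number, long a1, long a2, long a3) {
  return number * 1000 + a1 * 100 + a2 * 10 + a3;
}

// Keep the first four arguments and ignore the rest. The trailing
// variadic part is never empty because the caller below always pads.
#define DEMO_SYSCALL_PICK(number, a1, a2, a3, ...)                        \
  demo_syscall_impl((long)(number), (long)(a1), (long)(a2), (long)(a3))

// Pad the user's arguments with zeros. Calling demo_syscall() with no
// arguments leaves `number` empty and fails to compile, so the syscall
// number is always required.
#define demo_syscall(...) DEMO_SYSCALL_PICK(__VA_ARGS__, 0, 0, 0, 0)
```

With this shape, `demo_syscall(7)` and `demo_syscall(7, 1, 2, 3)` both reach the same fixed-arity backend, with missing arguments filled in as zero.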
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D134919
They were disabled because we were including linux/signal.h from our
signal.h. Linux's signal.h is not designed to be included from user
programs as it causes a lot of non-standard name pollution. Also, it is
not self-contained. This change defines types and macros relevant for
signal related syscalls within libc's headers and removes inclusion of
Linux headers.
This patch enables the functions only for x86_64. They will also be
enabled for aarch64 in a follow-up patch after testing.
Reviewed By: abrachet, lntue
Differential Revision: https://reviews.llvm.org/D134567
On Windows, including math.h causes macros named "OVERFLOW" and
"UNDERFLOW" to be defined. This patch renames some variables internal to
FEnvImpl.h to avoid colliding with them.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D134775
The existing thrd_once function has been refactored so that the
implementation can be shared between thrd_once and pthread_once
functions.
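The sharing can be pictured with the sketch below. It is an assumption-laden illustration, not the code from this patch: `OnceFlag`, `callonce_sketch` and the two wrapper names are hypothetical, and waiting threads simply yield, whereas a real implementation would likely block more efficiently.

```cpp
#include <atomic>
#include <thread>

using OnceFn = void (*)();

// Hypothetical flag type: 0 = not started, 1 = running, 2 = finished.
struct OnceFlag {
  std::atomic<int> state{0};
};

// Shared helper: exactly one caller runs fn, the others wait until it is done.
inline void callonce_sketch(OnceFlag &flag, OnceFn fn) {
  int expected = 0;
  if (flag.state.compare_exchange_strong(expected, 1,
                                         std::memory_order_acq_rel,
                                         std::memory_order_acquire)) {
    fn();
    flag.state.store(2, std::memory_order_release);
    return;
  }
  // Another caller won the race; wait for it to finish.
  while (flag.state.load(std::memory_order_acquire) != 2)
    std::this_thread::yield();
}

// Thin entry points with the return conventions of the two APIs.
inline void thrd_once_sketch(OnceFlag &flag, OnceFn fn) {
  callonce_sketch(flag, fn);
}
inline int pthread_once_sketch(OnceFlag &flag, OnceFn fn) {
  callonce_sketch(flag, fn);
  return 0;
}
```

Both wrappers differ only in their return type, so all of the synchronization logic lives in one place.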
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D134716
The Windows build has fallen behind a little. This patch fixes some
issues that were preventing it from building.
Specifically: some subfolders weren't being included, leading to missing
targets in CMake.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D134676
Tested:
Limited unit test: This makes a call and checks that no error was
returned, but we currently don't have the ability to ensure that
time has elapsed as expected.
Co-authored-by: Jeff Bailey <jeffbailey@google.com>
Reviewed By: sivachandra, jeffbailey
Differential Revision: https://reviews.llvm.org/D134095
Previously the mman macros were in api.td, but platform differences are
easier to handle with preprocessor macros so they have been moved to
include. Also I completed the list of macros (at least for what I need
soon) and fixed some previously incorrect values.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D134491
Strerror maps error numbers to strings. Additionally, a utility for
mapping errors to strings was added so that it could be reused for
perror and similar.
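A reusable error-to-string mapping could look roughly like the sketch below. The table contents, the `ErrorEntry` type and the `describe_error` name are illustrative assumptions and do not reflect the utility added in this patch.

```cpp
#include <array>
#include <cerrno>
#include <string>
#include <string_view>

// Hypothetical table entry pairing an error number with its message.
struct ErrorEntry {
  int number;
  std::string_view text;
};

// A tiny illustrative table; a real one would cover every errno value.
inline constexpr std::array<ErrorEntry, 3> ERROR_TABLE = {{
    {0, "Success"},
    {ENOENT, "No such file or directory"},
    {EINVAL, "Invalid argument"},
}};

// Shared lookup that strerror-like and perror-like callers could both use.
inline std::string describe_error(int number) {
  for (const ErrorEntry &entry : ERROR_TABLE)
    if (entry.number == number)
      return std::string(entry.text);
  // Fall back to a generic message for numbers missing from the table.
  return "Unknown error " + std::to_string(number);
}
```

Keeping the lookup separate from `strerror` itself means a `perror`-style function only needs to add its prefix and write the result out.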
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D134074
Implement the exp10f function, correctly rounded for all rounding modes.
Algorithm: perform range reduction to rewrite
```
10^x = 2^(hi + mid) * 10^lo
```
where:
```
hi is an integer,
0 <= mid * 2^5 < 2^5
-log10(2) / 2^6 <= lo <= log10(2) / 2^6
```
Then `2^mid` is stored in a table of 32 entries and the product `2^hi * 2^mid` is
performed by adding `hi` into the exponent field of `2^mid`.
`10^lo` is then approximated by a degree-5 minimax polynomial generated by Sollya with:
```
> P = fpminimax((10^x - 1)/x, 4, [|D...|], [-log10(2)/64, log10(2)/64]);
```
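The reduction above can be sketched in double precision as follows. This is not the patch's implementation: `exp10_sketch` is a hypothetical name, the table is filled at startup with `std::exp2` rather than stored as constants, and a degree-5 Taylor polynomial stands in for the Sollya minimax coefficients. It is only meant for inputs whose result is a normal double.

```cpp
#include <array>
#include <cmath>
#include <cstdint>
#include <cstring>

inline double exp10_sketch(double x) {
  constexpr double LOG2_10 = 3.321928094887362;   // log2(10)
  constexpr double LOG10_2 = 0.3010299956639812;  // log10(2)
  constexpr double LN_10 = 2.302585092994046;     // ln(10)

  // Table of 2^(i/32) for i = 0..31 (computed here, not hard-coded).
  static const std::array<double, 32> TABLE = [] {
    std::array<double, 32> t{};
    for (int i = 0; i < 32; ++i)
      t[i] = std::exp2(i / 32.0);
    return t;
  }();

  // k = round(32 * x * log2(10)), so that x = (k / 32) * log10(2) + lo.
  double kd = std::nearbyint(x * LOG2_10 * 32.0);
  double hi = std::floor(kd / 32.0);               // integer part
  int idx = static_cast<int>(kd - 32.0 * hi);      // mid = idx / 32
  double lo = x - kd * (LOG10_2 / 32.0);           // |lo| <= log10(2) / 64

  // 2^hi * 2^mid: add hi to the exponent field of the table entry.
  uint64_t bits;
  std::memcpy(&bits, &TABLE[idx], sizeof(bits));
  bits += static_cast<uint64_t>(static_cast<int64_t>(hi)) << 52;
  double scale;
  std::memcpy(&scale, &bits, sizeof(scale));

  // 10^lo = e^t with t = lo * ln(10); degree-5 Taylor stand-in.
  double t = lo * LN_10;
  double p = 1.0 + t * (1.0 + t * (0.5 + t * (1.0 / 6.0 +
             t * (1.0 / 24.0 + t * (1.0 / 120.0)))));
  return scale * p;
}
```

For example, `exp10_sketch(2.0)` reduces to `hi = 6`, `mid = 21/32` and a small negative `lo`, and the three factors multiply back to approximately 100.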
Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700:
```
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp10f
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH reciprocal throughput : 10.215
System LIBC reciprocal throughput : 7.944
LIBC reciprocal throughput : 38.538
LIBC reciprocal throughput : 12.175 (with `-msse4.2` flag)
LIBC reciprocal throughput : 9.862 (with `-mfma` flag)
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp10f --latency
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH latency : 40.744
System LIBC latency : 37.546
BEFORE
LIBC latency : 48.989
LIBC latency : 44.486 (with `-msse4.2` flag)
LIBC latency : 40.221 (with `-mfma` flag)
```
This patch relies on https://reviews.llvm.org/D134002
Reviewed By: orex, zimmermann6
Differential Revision: https://reviews.llvm.org/D134104
Now libc headers can be installed separately from installing the rest of
the libc.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D133960
Reduce the number of subintervals that need a lookup table and optimize
the evaluation steps.
Currently, `exp2f` is computed by reducing to `2^hi * 2^mid * 2^lo` where
`-16/32 <= mid <= 15/32` and `-1/64 <= lo <= 1/64`, and `2^lo` is then
approximated by a degree 6 polynomial.
Experiments with Sollya showed that by using a degree 6 polynomial, we
can approximate `2^lo` over a larger range with reasonable errors:
```
> P = fpminimax((2^x - 1)/x, 5, [|D...|], [-1/64, 1/64]);
> dirtyinfnorm(2^x - 1 - x*P, [-1/64, 1/64]);
0x1.e18a1bc09114def49eb851655e2e5c4dd08075ac2p-63
> P = fpminimax((2^x - 1)/x, 5, [|D...|], [-1/32, 1/32]);
> dirtyinfnorm(2^x - 1 - x*P, [-1/32, 1/32]);
0x1.05627b6ed48ca417fe53e3495f7df4baf84a05e2ap-56
```
So we can optimize the implementation a bit with:
# Reduce the range to `mid = i/16` for `i = 0..15` and `-1/32 <= lo <= 1/32`
# Store the table `2^mid` in bits, and add `hi` directly to its exponent field to compute `2^hi * 2^mid`
# Rearrange the order of evaluating the polynomial approximating `2^lo`.
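The three steps can be sketched in double precision as below. This is an illustration under stated assumptions, not the code from this patch: `exp2_sketch` is a hypothetical name, the bit table is generated at startup instead of being hard-coded, the degree 6 polynomial uses Taylor coefficients instead of the Sollya output, and the Estrin-style grouping is just one possible rearrangement of the evaluation order.

```cpp
#include <array>
#include <cmath>
#include <cstdint>
#include <cstring>

inline double exp2_sketch(double x) {
  constexpr double LN_2 = 0.6931471805599453; // ln(2)

  // Bit patterns of 2^(i/16) for i = 0..15.
  static const std::array<uint64_t, 16> TABLE_BITS = [] {
    std::array<uint64_t, 16> t{};
    for (int i = 0; i < 16; ++i) {
      double v = std::exp2(i / 16.0);
      std::memcpy(&t[i], &v, sizeof(v));
    }
    return t;
  }();

  // x = hi + idx/16 + lo with hi an integer and -1/32 <= lo <= 1/32.
  double kd = std::nearbyint(x * 16.0);
  double hi = std::floor(kd / 16.0);
  int idx = static_cast<int>(kd - 16.0 * hi);
  double lo = x - kd / 16.0;

  // 2^hi * 2^mid: add hi directly to the exponent field of the entry.
  uint64_t bits = TABLE_BITS[idx] +
                  (static_cast<uint64_t>(static_cast<int64_t>(hi)) << 52);
  double scale;
  std::memcpy(&scale, &bits, sizeof(scale));

  // 2^lo = e^t with t = lo * ln(2); degree 6, grouped Estrin-style so
  // that the partial sums are independent of each other.
  double t = lo * LN_2;
  double t2 = t * t;
  double p01 = 1.0 + t;
  double p23 = 0.5 + t * (1.0 / 6.0);
  double p456 = 1.0 / 24.0 + t * (1.0 / 120.0) + t2 * (1.0 / 720.0);
  double p = p01 + t2 * (p23 + t2 * p456);
  return scale * p;
}
```

Grouping the polynomial into independent partial sums shortens the dependency chain compared with plain Horner evaluation, which is one reason such a rearrangement can help latency.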
Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700:
```
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp2f
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH reciprocal throughput : 9.534
System LIBC reciprocal throughput : 6.229
BEFORE:
LIBC reciprocal throughput : 21.405
LIBC reciprocal throughput : 15.241 (with `-msse4.2` flag)
LIBC reciprocal throughput : 11.111 (with `-mfma` flag)
AFTER:
LIBC reciprocal throughput : 18.617
LIBC reciprocal throughput : 12.852 (with `-msse4.2` flag)
LIBC reciprocal throughput : 9.253 (with `-mfma` flag)
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp2f --latency
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH latency : 40.869
System LIBC latency : 30.580
BEFORE
LIBC latency : 64.888
LIBC latency : 61.027 (with `-msse4.2` flag)
LIBC latency : 48.778 (with `-mfma` flag)
AFTER
LIBC latency : 48.803
LIBC latency : 45.047 (with `-msse4.2` flag)
LIBC latency : 37.487 (with `-mfma` flag)
```
Reviewed By: sivachandra, orex
Differential Revision: https://reviews.llvm.org/D133870
Implement the acosf function, correctly rounded for all rounding modes.
We perform range reduction as follows:
- When `|x| < 2^(-10)`, we use the cubic Taylor polynomial:
```
acos(x) = pi/2 - asin(x) ~ pi/2 - x - x^3 / 6.
```
- When `2^(-10) <= |x| <= 0.5`, we use the same approximation that is used for `asinf(x)` when `|x| <= 0.5`:
```
acos(x) = pi/2 - asin(x) ~ pi/2 - x - x^3 * P(x^2).
```
- When `0.5 < x <= 1`, we use the double angle formula: `cos(2y) = 1 - 2 * sin^2 (y)` to reduce to:
```
acos(x) = 2 * asin( sqrt( (1 - x)/2 ) )
```
- When `-1 <= x < -0.5`, we reduce to the positive case above using the formula:
```
acos(x) = pi - acos(-x)
```
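The case split above can be sketched as follows. `acos_sketch` is a hypothetical name, and `std::asin` stands in for the polynomial `P` used by the real implementation, so this only illustrates the range reduction, not the correctly rounded evaluation.

```cpp
#include <cmath>
#include <limits>

inline double acos_sketch(double x) {
  constexpr double PI = 3.141592653589793;
  double ax = std::fabs(x);

  // Outside the domain [-1, 1] (or NaN input) there is no real result.
  if (!(ax <= 1.0))
    return std::numeric_limits<double>::quiet_NaN();

  // |x| < 2^(-10): cubic Taylor polynomial of pi/2 - asin(x).
  if (ax < 1.0 / 1024.0)
    return PI / 2 - x - x * x * x / 6;

  // 2^(-10) <= |x| <= 0.5: acos(x) = pi/2 - asin(x).
  if (ax <= 0.5)
    return PI / 2 - std::asin(x);

  // 0.5 < |x| <= 1: double angle formula applied to |x|.
  double positive = 2 * std::asin(std::sqrt((1 - ax) / 2));

  // Negative inputs reduce to the positive case: acos(x) = pi - acos(-x).
  return x > 0 ? positive : PI - positive;
}
```

For `x` close to 1 the argument of `asin` becomes small, which is why the double angle formula keeps the evaluation accurate near the endpoint.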
Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700:
```
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh acosf
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH reciprocal throughput : 28.613
System LIBC reciprocal throughput : 29.204
LIBC reciprocal throughput : 24.271
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh acosf --latency
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH latency : 55.554
System LIBC latency : 76.879
LIBC latency : 62.118
```
Reviewed By: orex, zimmermann6
Differential Revision: https://reviews.llvm.org/D133550
This function cannot have any instrumentation because its
assembly must match exactly what the debugger is expecting.
Previously it was just a list of the sanitizers we expect
libc to be sanitized with, but this is untenable.
Update the utility functions for checking exceptional values of math
functions to use cpp::optional return values.
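The pattern can be sketched with `std::optional` standing in for `cpp::optional`. The names `ExceptionalEntry`, `lookup_exceptional` and `compute_sketch` are hypothetical, and the table values are placeholders chosen for illustration, not real exceptional cases of any math function.

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <optional>

// Hypothetical entry: the bit pattern of an input and the result to return.
struct ExceptionalEntry {
  uint32_t input_bits;
  float result;
};

// Returns the stored result if x is a listed input, otherwise an empty
// optional. The two entries below are placeholders.
inline std::optional<float> lookup_exceptional(float x) {
  static constexpr std::array<ExceptionalEntry, 2> TABLE = {{
      {0x3f800000u, 10.0f}, // bits of 1.0f
      {0x40000000u, 20.0f}, // bits of 2.0f
  }};
  uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));
  for (const ExceptionalEntry &entry : TABLE)
    if (entry.input_bits == bits)
      return entry.result;
  return std::nullopt;
}

// Caller: take the exceptional result if present, else the general path.
inline float compute_sketch(float x) {
  if (std::optional<float> r = lookup_exceptional(x))
    return *r;
  return x * 2.0f; // stand-in for the general computation
}
```

Returning an optional lets the caller distinguish "no exceptional value" from any valid floating-point result without an extra out-parameter or a separate boolean.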
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D133134