llvm-project

Commit Graph

Author	SHA1	Message	Date
Jeff Bailey	aa8ab5b213	[libc] Document which date funcs are needed/done Reviewed By: rtenneti Differential Revision: https://reviews.llvm.org/D135501	2022-10-08 00:06:22 +00:00
Tue Ly	e15b2da42f	[libc][math] Simplify tanf implementation and improve its performance. Simplify `tanf` implementation and improve its performance. Completely reuse the implementation of `sinf`, `cosf`, `sincosf` and use the definition `tan(x) = sin(x)/cos(x)`. Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700: ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanf GNU libc version: 2.35 GNU libc release: stable CORE-MATH reciprocal throughput : 18.558 System LIBC reciprocal throughput : 49.919 BEFORE: LIBC reciprocal throughput : 36.480 LIBC reciprocal throughput : 27.217 (with `-msse4.2` flag) LIBC reciprocal throughput : 20.205 (with `-mfma` flag) AFTER: LIBC reciprocal throughput : 30.337 LIBC reciprocal throughput : 21.072 (with `-msse4.2` flag) LIBC reciprocal throughput : 15.804 (with `-mfma` flag) $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanf --latency GNU libc version: 2.35 GNU libc release: stable CORE-MATH latency : 56.702 System LIBC latency : 107.206 BEFORE LIBC latency : 97.598 LIBC latency : 91.119 (with `-msse4.2` flag) LIBC latency : 82.655 (with `-mfma` flag) AFTER LIBC latency : 74.560 LIBC latency : 66.575 (with `-msse4.2` flag) LIBC latency : 61.636 (with `-mfma` flag) ``` Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D134575	2022-09-26 21:36:12 -04:00
Michael Jones	a9e0dbefdd	[libc] add fputs and puts add fputs, puts, and the EOF macro that they use. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D134328	2022-09-21 11:10:20 -07:00
Tue Ly	a752460d73	[libc][math] Implement exp10f function correctly rounded to all rounding modes. Implement exp10f function correctly rounded to all rounding modes. Algorithm: perform range reduction to reduce ``` 10^x = 2^(hi + mid) * 10^lo ``` where: ``` hi is an integer, 0 <= mid * 2^5 < 2^5 -log10(2) / 2^6 <= lo <= log10(2) / 2^6 ``` Then `2^mid` is stored in a table of 32 entries and the product `2^hi * 2^mid` is performed by adding `hi` into the exponent field of `2^mid`. `10^lo` is then approximated by a degree-5 minimax polynomial generated by Sollya with: ``` > P = fpminimax((10^x - 1)/x, 4, [|D...|], [-log10(2)/64, log10(2)/64]); ``` Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700: ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp10f GNU libc version: 2.35 GNU libc release: stable CORE-MATH reciprocal throughput : 10.215 System LIBC reciprocal throughput : 7.944 LIBC reciprocal throughput : 38.538 LIBC reciprocal throughput : 12.175 (with `-msse4.2` flag) LIBC reciprocal throughput : 9.862 (with `-mfma` flag) $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp10f --latency GNU libc version: 2.35 GNU libc release: stable CORE-MATH latency : 40.744 System LIBC latency : 37.546 BEFORE LIBC latency : 48.989 LIBC latency : 44.486 (with `-msse4.2` flag) LIBC latency : 40.221 (with `-mfma` flag) ``` This patch relies on https://reviews.llvm.org/D134002 Reviewed By: orex, zimmermann6 Differential Revision: https://reviews.llvm.org/D134104	2022-09-19 10:01:40 -04:00
Tue Ly	4973eee122	[libc][math] Improve tanhf performance. Optimize the core part of `tanhf` implementation that is to compute `e^x` similar to https://reviews.llvm.org/D133870. Factor the constants and polynomial approximation out so that it can be used for `exp10f` Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700: ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanhf GNU libc version: 2.35 GNU libc release: stable CORE-MATH reciprocal throughput : 13.377 System LIBC reciprocal throughput : 55.046 BEFORE: LIBC reciprocal throughput : 75.674 LIBC reciprocal throughput : 33.242 (with `-msse4.2` flag) LIBC reciprocal throughput : 25.927 (with `-mfma` flag) AFTER: LIBC reciprocal throughput : 26.359 LIBC reciprocal throughput : 18.888 (with `-msse4.2` flag) LIBC reciprocal throughput : 14.243 (with `-mfma` flag) $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanhf --latency GNU libc version: 2.35 GNU libc release: stable CORE-MATH latency : 43.365 System LIBC latency : 123.499 BEFORE LIBC latency : 112.968 LIBC latency : 104.908 (with `-msse4.2` flag) LIBC latency : 92.310 (with `-mfma` flag) AFTER LIBC latency : 69.828 LIBC latency : 63.874 (with `-msse4.2` flag) LIBC latency : 57.427 (with `-mfma` flag) ``` Reviewed By: orex, zimmermann6 Differential Revision: https://reviews.llvm.org/D134002	2022-09-19 08:43:03 -04:00
Tue Ly	1c89ae71ea	[libc][math] Improve sinhf and coshf performance. Optimize `sinhf` and `coshf` by computing exp(x) and exp(-x) simultaneously. Currently `sinhf` and `coshf` are implemented using the following formulas: ``` sinh(x) = 0.5 (exp(x) - 1) - 0.5(exp(-x) - 1) cosh(x) = 0.5exp(x) + 0.5exp(-x) ``` where `exp(x)` and `exp(-x)` are calculated separately using the formula: ``` exp(x) ~ 2^hi * 2^mid * exp(dx) ~ 2^hi * 2^mid * P(dx) ``` By expanding the polynomial `P(dx)` into even and odd parts ``` P(dx) = P_even(dx) + dx * P_odd(dx) ``` we can see that the computations of `exp(x)` and `exp(-x)` have many things in common, namely: ``` exp(x) ~ 2^(hi + mid) * (P_even(dx) + dx * P_odd(dx)) exp(-x) ~ 2^(-(hi + mid)) * (P_even(dx) - dx * P_odd(dx)) ``` Expanding `sinh(x)` and `cosh(x)` with respect to the above formulas, we can compute these two functions as follow in order to maximize the sharing parts: ``` sinh(x) = (e^x - e^(-x)) / 2 ~ 0.5 * (P_even * (2^(hi + mid) - 2^(-(hi + mid))) + dx * P_odd * (2^(hi + mid) + 2^(-(hi + mid)))) cosh(x) = (e^x + e^(-x)) / 2 ~ 0.5 * (P_even * (2^(hi + mid) + 2^(-(hi + mid))) + dx * P_odd * (2^(hi + mid) - 2^(-(hi + mid)))) ``` So in this patch, we perform the following optimizations for `sinhf` and `coshf`: # Use the above formulas to maximize sharing intermediate results, # Apply similar optimizations from https://reviews.llvm.org/D133870 Performance benchmark using `perf` tool from the CORE-MATH project on Ryzen 1700: For `sinhf`: ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh sinhf GNU libc version: 2.35 GNU libc release: stable CORE-MATH reciprocal throughput : 16.718 System LIBC reciprocal throughput : 63.151 BEFORE: LIBC reciprocal throughput : 90.116 LIBC reciprocal throughput : 28.554 (with `-msse4.2` flag) LIBC reciprocal throughput : 22.577 (with `-mfma` flag) AFTER: LIBC reciprocal throughput : 36.482 LIBC reciprocal throughput : 16.955 (with `-msse4.2` flag) LIBC reciprocal throughput : 13.943 (with `-mfma` flag) $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh sinhf --latency GNU libc version: 2.35 GNU libc release: stable CORE-MATH latency : 48.821 System LIBC latency : 137.019 BEFORE LIBC latency : 97.122 LIBC latency : 84.214 (with `-msse4.2` flag) LIBC latency : 71.611 (with `-mfma` flag) AFTER LIBC latency : 54.555 LIBC latency : 50.865 (with `-msse4.2` flag) LIBC latency : 48.700 (with `-mfma` flag) ``` For `coshf`: ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh coshf GNU libc version: 2.35 GNU libc release: stable CORE-MATH reciprocal throughput : 16.939 System LIBC reciprocal throughput : 19.695 BEFORE: LIBC reciprocal throughput : 52.845 LIBC reciprocal throughput : 29.174 (with `-msse4.2` flag) LIBC reciprocal throughput : 22.553 (with `-mfma` flag) AFTER: LIBC reciprocal throughput : 37.169 LIBC reciprocal throughput : 17.805 (with `-msse4.2` flag) LIBC reciprocal throughput : 14.691 (with `-mfma` flag) $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh coshf --latency GNU libc version: 2.35 GNU libc release: stable CORE-MATH latency : 48.478 System LIBC latency : 48.044 BEFORE LIBC latency : 99.123 LIBC latency : 85.595 (with `-msse4.2` flag) LIBC latency : 72.776 (with `-mfma` flag) AFTER LIBC latency : 57.760 LIBC latency : 53.967 (with `-msse4.2` flag) LIBC latency : 50.987 (with `-mfma` flag) ``` Reviewed By: orex, zimmermann6 Differential Revision: https://reviews.llvm.org/D133913	2022-09-15 09:20:39 -04:00
Tue Ly	e6226e6b72	[libc][math] Improve exp2f performance. Reduce the number of subintervals that need lookup table and optimize the evaluation steps. Currently, `exp2f` is computed by reducing to `2^hi * 2^mid * 2^lo` where `-16/32 <= mid <= 15/32` and `-1/64 <= lo <= 1/64`, and `2^lo` is then approximated by a degree 6 polynomial. Experiment with Sollya showed that by using a degree 6 polynomial, we can approximate `2^lo` for a bigger range with reasonable errors: ``` > P = fpminimax((2^x - 1)/x, 5, [|D...|], [-1/64, 1/64]); > dirtyinfnorm(2^x - 1 - x*P, [-1/64, 1/64]); 0x1.e18a1bc09114def49eb851655e2e5c4dd08075ac2p-63 > P = fpminimax((2^x - 1)/x, 5, [|D...|], [-1/32, 1/32]); > dirtyinfnorm(2^x - 1 - x*P, [-1/32, 1/32]); 0x1.05627b6ed48ca417fe53e3495f7df4baf84a05e2ap-56 ``` So we can optimize the implementation a bit with: # Reduce the range to `mid = i/16` for `i = 0..15` and `-1/32 <= lo <= 1/32` # Store the table `2^mid` in bits, and add `hi` directly to its exponent field to compute `2^hi * 2^mid` # Rearrange the order of evaluating the polynomial approximating `2^lo`. Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700: ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp2f GNU libc version: 2.35 GNU libc release: stable CORE-MATH reciprocal throughput : 9.534 System LIBC reciprocal throughput : 6.229 BEFORE: LIBC reciprocal throughput : 21.405 LIBC reciprocal throughput : 15.241 (with `-msse4.2` flag) LIBC reciprocal throughput : 11.111 (with `-mfma` flag) AFTER: LIBC reciprocal throughput : 18.617 LIBC reciprocal throughput : 12.852 (with `-msse4.2` flag) LIBC reciprocal throughput : 9.253 (with `-mfma` flag) $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp2f --latency GNU libc version: 2.35 GNU libc release: stable CORE-MATH latency : 40.869 System LIBC latency : 30.580 BEFORE LIBC latency : 64.888 LIBC latency : 61.027 (with `-msse4.2` flag) LIBC latency : 48.778 (with `-mfma` flag) AFTER LIBC latency : 48.803 LIBC latency : 45.047 (with `-msse4.2` flag) LIBC latency : 37.487 (with `-mfma` flag) ``` Reviewed By: sivachandra, orex Differential Revision: https://reviews.llvm.org/D133870	2022-09-14 14:44:25 -04:00
Tue Ly	463dcc8749	[libc][math] Implement acosf function correctly rounded for all rounding modes. Implement acosf function correctly rounded for all rounding modes. We perform range reduction as follows: - When `|x| < 2^(-10)`, we use cubic Taylor polynomial: ``` acos(x) = pi/2 - asin(x) ~ pi/2 - x - x^3 / 6. ``` - When `2^(-10) <= |x| <= 0.5`, we use the same approximation that is used for `asinf(x)` when `|x| <= 0.5`: ``` acos(x) = pi/2 - asin(x) ~ pi/2 - x - x^3 * P(x^2). ``` - When `0.5 < x <= 1`, we use the double angle formula: `cos(2y) = 1 - 2 * sin^2 (y)` to reduce to: ``` acos(x) = 2 * asin( sqrt( (1 - x)/2 ) ) ``` - When `-1 <= x < -0.5`, we reduce to the positive case above using the formula: ``` acos(x) = pi - acos(-x) ``` Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700: ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh acosf GNU libc version: 2.35 GNU libc release: stable CORE-MATH reciprocal throughput : 28.613 System LIBC reciprocal throughput : 29.204 LIBC reciprocal throughput : 24.271 $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh asinf --latency GNU libc version: 2.35 GNU libc release: stable CORE-MATH latency : 55.554 System LIBC latency : 76.879 LIBC latency : 62.118 ``` Reviewed By: orex, zimmermann6 Differential Revision: https://reviews.llvm.org/D133550	2022-09-09 09:55:30 -04:00
Tue Ly	e2f065c2a3	[libc][math] Implement asinf function correctly rounded for all rounding modes. Implement asinf function correctly rounded for all rounding modes. For `|x| <= 0.5`, we approximate `asin(x)` by ``` asin(x) = x * P(x^2) ``` where `P(X^2) = Q(X)` is a degree-20 minimax even polynomial approximating `asin(x)/x` on `[0, 0.5]` generated by Sollya with: ``` > Q = fpminimax(asin(x)/x, [|0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20|], [|1, D...|], [0, 0.5]); ``` When `|x| > 0.5`, we perform range reduction as follows: Assume further that `0.5 < x <= 1`, and let: ``` y = asin(x) ``` We will use the double angle formula: ``` cos(2X) = 1 - 2 sin^2(X) ``` and the complement angle identity: ``` x = sin(y) = cos(pi/2 - y) = 1 - 2 sin^2 (pi/4 - y/2) ``` So: ``` sin(pi/4 - y/2) = sqrt( (1 - x)/2 ) ``` And hence: ``` pi/4 - y/2 = asin( sqrt( (1 - x)/2 ) ) ``` Equivalently: ``` asin(x) = y = pi/2 - 2 * asin( sqrt( (1 - x)/2 ) ) ``` Let `u = (1 - x)/2`, then ``` asin(x) = pi/2 - 2 * asin(sqrt(u)) ``` Moreover, since `0.5 < x <= 1`, ``` 0 <= u < 1/4, and 0 <= sqrt(u) < 0.5. ``` And hence we can reuse the same polynomial approximation of `asin(x)` when `|x| <= 0.5`: ``` asin(x) = pi/2 - 2 * sqrt(u) * P(u). ``` Performance benchmark using `perf` tool from the CORE-MATH project on Ryzen 1700: ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh asinf CORE-MATH reciprocal throughput : 23.418 System LIBC reciprocal throughput : 27.310 LIBC reciprocal throughput : 22.741 $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh asinf --latency GNU libc version: 2.35 GNU libc release: stable CORE-MATH latency : 58.884 System LIBC latency : 62.055 LIBC latency : 62.037 ``` Reviewed By: orex, zimmermann6 Differential Revision: https://reviews.llvm.org/D133400	2022-09-07 19:27:47 -04:00
Jeff Bailey	0dcbe0e1df	[libc] Add Buildbot to External Links Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D133186	2022-09-02 14:11:09 +00:00
Tue Ly	647b190a5c	[libc][doc] Update implementation status of atanf and atanhf.	2022-08-31 01:27:23 -04:00
Tue Ly	82d6e77048	[libc] Implement tanf function correctly rounded for all rounding modes. Implement tanf function correctly rounded for all rounding modes. We use the range reduction that is shared with `sinf`, `cosf`, and `sincosf`: ``` k = round(x * 32/pi) and y = x * (32/pi) - k. ``` Then we use the tangent of sum formula: ``` tan(x) = tan((k + y)* pi/32) = tan((k mod 32) * pi / 32 + y * pi/32) = (tan((k mod 32) * pi/32) + tan(y * pi/32)) / (1 - tan((k mod 32) * pi/32) * tan(y * pi/32)) ``` We need to make a further reduction when `k mod 32 >= 16` due to the pole at `pi/2` of `tan(x)` function: ``` if (k mod 32 >= 16): k = k - 31, y = y - 1.0 ``` And to compute the final result, we store `tan(k * pi/32)` for `k = -15..15` in a table of 32 double values, and evaluate `tan(y * pi/32)` with a degree-11 minimax odd polynomial generated by Sollya with: ``` > P = fpminimax(tan(y * pi/32)/y, [|0, 2, 4, 6, 8, 10|], [|D...|], [0, 1.5]); ``` Performance benchmark using `perf` tool from the CORE-MATH project on Ryzen 1700: ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanf CORE-MATH reciprocal throughput : 18.586 System LIBC reciprocal throughput : 50.068 LIBC reciprocal throughput : 33.823 LIBC reciprocal throughput : 25.161 (with `-msse4.2` flag) LIBC reciprocal throughput : 19.157 (with `-mfma` flag) $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanf --latency GNU libc version: 2.31 GNU libc release: stable CORE-MATH latency : 55.630 System LIBC latency : 106.264 LIBC latency : 96.060 LIBC latency : 90.727 (with `-msse4.2` flag) LIBC latency : 82.361 (with `-mfma` flag) ``` Reviewed By: orex Differential Revision: https://reviews.llvm.org/D131715	2022-08-12 09:21:05 -04:00
Tue Ly	42f183792c	[libc] Change sinf/cosf range reduction to mod pi/32 to be shared with tanf. Change sinf/cosf range reduction to mod pi/32 to be shared with tanf, since polynomial approximations for tanf on subintervals of length pi/16 do not provide enough accuracy. Reviewed By: orex Differential Revision: https://reviews.llvm.org/D131652	2022-08-11 09:41:45 -04:00
Jeff Bailey	7889c41938	[libc] Website fixes (sidebar and mobile) Add "using" and "status" sections to the sidebar to make getting these easier. Fixed mobile formatting not overflow left and right. Tested: Chrome on Desktop, using mobile restrictions in devtools. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D131369	2022-08-08 18:38:01 +00:00
Jeff Bailey	f493b21e16	[libc] Update look and feel of libc.llvm.org This design is borrowed from the lldb folks (thank you!) to declutter the page. * The version number at the top is removed. * Links are pushed over to a sidebar * The sidebar has headings There are other minor changes: * The warning about this project not being ready is now an RST "warning" * Links to the Bug Reports and the Source Code are Added * Refer to this project as either "The LLVM C LIbrary" or "The libc" Tested: Built locally Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D131242	2022-08-05 18:18:40 +00:00
Tue Ly	131dda9acc	[libc] Implement sincosf function correctly rounded to all rounding modes. Refactor common range reductions and evaluations for sinf, cosf, and sincosf. Added exhaustive tests for sincosf. Performance before the patch: ``` System LIBC reciprocal throughput : 30.205 LIBC reciprocal throughput : 30.533 System LIBC latency : 67.961 LIBC latency : 61.564 ``` Performance after the patch: ``` System LIBC reciprocal throughput : 30.409 LIBC reciprocal throughput : 20.273 System LIBC latency : 67.527 LIBC latency : 61.959 ``` Reviewed By: orex Differential Revision: https://reviews.llvm.org/D130901	2022-08-05 09:58:01 -04:00
Tue Ly	69cc240534	[libc][doc] Update implementation status of tanhf.	2022-08-01 17:45:40 -04:00
Tue Ly	17df74214c	[libc][doc] Update implementation status of exp2f, sinhf, and coshf.	2022-07-31 16:32:21 -04:00
Tue Ly	2ff187fbc9	[libc] Implement cosf function that is correctly rounded to all rounding modes. Implement cosf function that is correctly rounded to all rounding modes. Performance benchmark using perf tool from CORE-MATH project (https://gitlab.inria.fr/core-math/core-math/-/tree/master) on Ryzen 1700: Before this patch (not correctly rounded): ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh cosf CORE-MATH reciprocal throughput : 19.043 System LIBC reciprocal throughput : 26.328 LIBC reciprocal throughput : 30.955 $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh cosf --latency GNU libc version: 2.31 GNU libc release: stable CORE-MATH latency : 49.995 System LIBC latency : 59.286 LIBC latency : 60.174 ``` After this patch (correctly rounded): ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh cosf GNU libc version: 2.31 GNU libc release: stable CORE-MATH reciprocal throughput : 19.072 System LIBC reciprocal throughput : 26.286 LIBC reciprocal throughput : 13.631 $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh cosf --latency GNU libc version: 2.31 GNU libc release: stable CORE-MATH latency : 49.872 System LIBC latency : 59.468 LIBC latency : 56.119 ``` Reviewed By: orex, zimmermann6 Differential Revision: https://reviews.llvm.org/D130644	2022-07-29 21:08:31 -04:00
Kirill Okhotnikov	c78144e1c7	[libc][math] Improved performance of exp2f function. New exp2 function algorithm: 1) Improved performance: 8.176 vs 15.270 by core-math perf tool. 2) Improved accuracy. Only two special values left. 3) Lookup table size reduced twice. Differential Revision: https://reviews.llvm.org/D129005	2022-07-28 10:57:16 +02:00
Tue Ly	15b9380dfd	[libc] Change sinf range reduction to mod pi/16 to be shared with cosf. Change `sinf` range reduction to mod pi/16 to be shared with `cosf`. Previously, `sinf` used range reduction `mod pi`, but this cannot be used to implement `cosf` since the minimax algorithm for `cosf` does not converge due to critical points at `pi/2`. In order to be able to share the same range reduction functions for both `sinf` and `cosf`, we change the range reduction to `mod pi/16` for the following reasons: - The table size is sufficiently small: 32 entries for `sin(k * pi/16)` with `k = 0..31`. It could be reduced to 16 entries if we treat the final sign separately, with an extra multiplication at the end. - The polynomials' degrees are reduced to 7/8 from 15, with extra computations to combine `sin` and `cos` with trig sum equality. - The number of exceptional cases reduced to 2 (with FMA) and 3 (without FMA). - The latency is reduced while maintaining similar throughput as before. Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D130629	2022-07-27 12:23:36 -04:00
Tue Ly	628fbbef81	[libc] Use nearest_integer instructions to improve expm1f performance. Use nearest_integer instructions to improve expf performance. Performance tests with CORE-MATH's perf tool: Before the patch: ``` $ ./perf.sh expm1f LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a GNU libc version: 2.31 GNU libc release: stable CORE-MATH reciprocal throughput : 10.096 System LIBC reciprocal throughput : 44.036 LIBC reciprocal throughput : 11.575 $ ./perf.sh expm1f --latency LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a GNU libc version: 2.31 GNU libc release: stable CORE-MATH latency : 42.239 System LIBC latency : 122.815 LIBC latency : 50.122 ``` After the patch: ``` $ ./perf.sh expm1f LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a GNU libc version: 2.31 GNU libc release: stable CORE-MATH reciprocal throughput : 10.046 System LIBC reciprocal throughput : 43.899 LIBC reciprocal throughput : 9.179 $ ./perf.sh expm1f --latency LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a GNU libc version: 2.31 GNU libc release: stable CORE-MATH latency : 42.078 System LIBC latency : 120.488 LIBC latency : 41.528 ``` Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D130502	2022-07-26 09:12:37 -04:00
Tue Ly	91ee672062	[libc] Use nearest_integer instructions to improve expf performance. Use nearest_integer instructions to improve expf performance. Performance tests with CORE-MATH's perf tool: Before the patch: ``` $ ./perf.sh expf LIBC-location: /home/lnt/experiment/llvm-project/build/projects/libc/lib/libllvmlibc.a GNU libc version: 2.31 GNU libc release: stable CORE-MATH reciprocal throughput : 9.860 System LIBC reciprocal throughput : 7.728 LIBC reciprocal throughput : 12.363 $ ./perf.sh expf --latency LIBC-location: /home/lnt/experiment/llvm-project/build/projects/libc/lib/libllvmlibc.a GNU libc version: 2.31 GNU libc release: stable CORE-MATH latency : 42.802 System LIBC latency : 35.941 LIBC latency : 49.808 ``` After the patch: ``` $ ./perf.sh expf LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a GNU libc version: 2.31 GNU libc release: stable CORE-MATH reciprocal throughput : 9.441 System LIBC reciprocal throughput : 7.382 LIBC reciprocal throughput : 8.843 $ ./perf.sh expf --latency LIBC-location: /home/lnt/experiment/llvm/llvm-project/build/projects/libc/lib/libllvmlibc.a GNU libc version: 2.31 GNU libc release: stable CORE-MATH latency : 44.192 System LIBC latency : 37.693 LIBC latency : 44.145 ``` Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D130498	2022-07-26 09:11:27 -04:00
Tue Ly	d883a4ad02	[libc] Implement sinf function that is correctly rounded to all rounding modes. Implement sinf function that is correctly rounded to all rounding modes. - We use a simple range reduction for `pi/16 < |x|` : Let `k = round(x / pi)` and `y = (x/pi) - k`. So `k` is an integer and `-0.5 <= y <= 0.5`. Then ``` sin(x) = sin(y*pi + k*pi) = (-1)^(k & 1) * sin(y*pi) ~ (-1)^(k & 1) * y * P(y^2) ``` where `y*P(y^2)` is a degree-15 minimax polynomial generated by Sollya with: ``` > P = fpminimax(sin(x*pi)/x, [|0, 2, 4, 6, 8, 10, 12, 14|], [|D...|], [0, 0.5]); ``` - Performance benchmark using perf tool from CORE-MATH project (https://gitlab.inria.fr/core-math/core-math/-/tree/master) on Ryzen 1700: Before this patch (not correctly rounded): ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh sinf CORE-MATH reciprocal throughput : 17.892 System LIBC reciprocal throughput : 25.559 LIBC reciprocal throughput : 29.381 ``` After this patch (correctly rounded): ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh sinf CORE-MATH reciprocal throughput : 17.896 System LIBC reciprocal throughput : 25.740 LIBC reciprocal throughput : 27.872 LIBC reciprocal throughput : 20.012 (with `-msse4.2` flag) LIBC reciprocal throughput : 14.244 (with `-mfma` flag) ``` Reviewed By: zimmermann6 Differential Revision: https://reviews.llvm.org/D123154	2022-07-22 10:07:31 -04:00
Kirill Okhotnikov	5358457089	[libc][docs] Added fmod performance results.	2022-06-27 19:31:54 +02:00
Kirill Okhotnikov	b8e8012aa2	[libc][math] fmod/fmodf implementation. This is an implementation of the find-remainder fmod function from standard libm. The underlying algorithm is developed by myself, but probably it was first invented before. Some features of the implementation: 1. The code is written in more-or-less modern C++. 2. One general implementation for both float and double precision numbers. 3. Split platform/architecture dependent and independent code and tests. 4. Tests cover 100% of the code for both float and double numbers. Test cases with NaN/Inf etc. are copied from glibc. 5. The new implementation is in general 2-4 times faster for “regular” x,y values. It can be 20 times faster for x/y huge value, but can also be 2 times slower for double denormalized range (according to perf tests provided). 6. Two different implementations of the division loop are provided. On some platforms division can be a very time-consuming operation. Depending on the platform it can be 3-10 times slower than multiplication. Performance tests: The test is based on core-math project (https://gitlab.inria.fr/core-math/core-math). By Tue Ly's suggestion I took the hypot function and used it as a template for fmod, preserving all test cases. `./check.sh <--special|--worst> fmodf` passed. `CORE_MATH_PERF_MODE=rdtsc ./perf.sh fmodf` results are ``` GNU libc version: 2.35 GNU libc release: stable 21.166 <-- FPU 51.031 <-- current glibc 37.659 <-- this fmod version. ```	2022-06-24 23:09:14 +02:00
Tue Ly	6441bfb886	[libc][Obvious] Fix hyperlink and typo in math status page.	2022-06-17 09:35:51 -04:00
Tue Ly	72c1effb34	[libc] Add a status page for math functions. Add a status page for math functions. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D127920	2022-06-16 17:41:46 -04:00
Jeff Bailey	ef3db4fcab	Replace Goals and Why section with Introduction Rewrite the introduction of the page to state clearly the goals of LLVM's libc project. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D127174	2022-06-07 06:53:54 +00:00
Alex Brachet	8725dc5e2f	[libc][docs] Use same formatting for headers in source_layout utils looks different from the other directory names in the docs, see https://libc.llvm.org/source_layout.html#the-utils-directory Differential revision: https://reviews.llvm.org/D126211	2022-05-23 21:47:22 +00:00
Michael Jones	12aae7d9a6	[libc][docs] Add doc for libc stdio functions This patch adds a document describing the status of the string functions in LLVM-libc. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D123823	2022-05-12 13:02:23 -07:00
Siva Chandra Reddy	e6d56802f8	[libc][docs] Update the fuzzing doc to better reflect the current state. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D123923	2022-04-20 15:33:20 +00:00
Siva Chandra Reddy	f8cdbeb471	[libc][docs] Remove the description of a "www" directory. We plan to use the "docs" directory as the home for our "www" pages, similar to how it is for the libcxx project.	2022-04-18 07:16:21 +00:00
Siva Chandra Reddy	6b4ee566e9	[libc] Add a doc describing the current status of libc runtimes build. A section briefly mentioning the planned future enhancements has also been included. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D123761	2022-04-18 06:48:43 +00:00
Michael Jones	f14334ffa1	[libc][docs] Add doc for libc string functions This patch adds a document describing the status of the string functions in LLVM-libc. Reviewed By: sivachandra, jeffbailey Differential Revision: https://reviews.llvm.org/D123645	2022-04-14 13:03:01 -07:00
Siva Chandra Reddy	3bfbb68e1e	[libc] Rename libc-integration-test to libc-api-test. Reviewed By: jeffbailey, michaelrj Differential Revision: https://reviews.llvm.org/D122272	2022-03-23 20:25:34 +00:00
Jeff Bailey	171cb8f53f	Rewrite much of the index page for libc The prior page was the proposal doc, this one is now more about what the project intends to do, written in the present tense. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D119379	2022-02-16 03:46:20 +00:00
Jeff Bailey	4465c29906	Move LLVM Proposal to doc directory, create index The LLVM Libc project is no longer just a proposal and should have a webpage tracking the status of the project. This changes puts the pieces into the right place so that the webpage can be created. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D117436	2022-01-29 00:29:31 +00:00
Michael Jones	155f5a6dac	[libc][clang-tidy] fix namespace check for externals Up until now, all references to `errno` were marked with `NOLINT`, since it was technically calling an external function. This fixes the lint rules so that `errno`, as well as `malloc`, `calloc`, `realloc`, and `free` are all allowed to be called as external functions. All of the relevant `NOLINT` comments have been removed, and the documentation has been updated. Reviewed By: sivachandra, lntue, aaron.ballman Differential Revision: https://reviews.llvm.org/D113946	2021-11-30 11:44:24 -08:00
Shao-Ce SUN	0c660256eb	[NFC] Trim trailing whitespace in *.rst	2021-11-15 09:17:08 +08:00
Paula Toth	ab25ed26c6	[libc] Add documentation for clang-tidy checks. Reviewers: sivachandra Reviewed By: sivachandra Subscribers: tschuett, ecnelises, libc-commits Tags: #libc-project Differential Revision: https://reviews.llvm.org/D82846	2020-07-06 18:15:35 -07:00
Paula Toth	aa6ef6fea0	[libc] Add documentation for integration tests. Reviewers: sivachandra Reviewed By: sivachandra Subscribers: MaskRay, tschuett, ecnelises, libc-commits Tags: #libc-project Differential Revision: https://reviews.llvm.org/D82907	2020-07-06 12:44:32 -07:00
Kazuaki Ishizaki	0570de73c4	[libc] NFC: Fix trivial typo in comments, documents, and messages Differential Revision: https://reviews.llvm.org/D77462	2020-04-06 16:19:34 +09:00
Siva Chandra Reddy	4d812acba6	[libc] Add a README to the sub-directories under the utils directory. Also, the source layout document has been updated to reflect the current layout of the `utils` directory. Reviewers: PaulkaToast Differential Revision: https://reviews.llvm.org/D74502	2020-02-23 22:11:35 -08:00
Paula Toth	a4f45ee73a	[libc] Lay out framework for fuzzing libc functions. Summary: Added fuzzing test for strcpy and some documentation related to fuzzing. This will be the first step in integrating this with oss-fuzz. Reviewers: sivachandra, abrachet Reviewed By: sivachandra, abrachet Subscribers: gchatelet, abrachet, mgorny, MaskRay, tschuett, libc-commits Tags: #libc-project Differential Revision: https://reviews.llvm.org/D74091	2020-02-21 19:15:46 -08:00
Paula Toth	3101def847	[libc] Fix typo in header generation docs. Reviewers: sivachandra, abrachet Reviewed By: sivachandra, abrachet Subscribers: libc-commits, MaskRay, tschuett Tags: #libc-project, #llvm Differential Revision: https://reviews.llvm.org/D72248	2020-02-04 11:43:59 -08:00
Siva Chandra Reddy	5b24c08817	[libc] Move all tests to a top level `test` directory. A toplevel target, `check-libc` has also been added. Reviewers: abrachet, phosek Tags: #libc-project Differential Revision: https://reviews.llvm.org/D72177	2020-01-06 10:14:43 -08:00
Siva Chandra Reddy	b47f9eb55d	[libc] Add a TableGen based header generator. Summary: * The Python header generator has been removed. * Docs giving a highlevel overview of the header gen scheme have been added. Reviewers: phosek, abrachet Subscribers: mgorny, MaskRay, tschuett, libc-commits Tags: #libc-project Differential Revision: https://reviews.llvm.org/D70197	2019-11-22 13:02:24 -08:00
Siva Chandra Reddy	9364107cf3	Illustrate a redirector using the example of round function from math.h. Setup demonstrated in this patch is only for ELF-ish platforms. Also note: 1. Use of redirectors is a temporary scheme. They will be removed once LLVM-libc has implementations for the redirected functions. 2. Redirectors are optional. One can choose to not include them in the LLVM-libc build for their platform. 3. Even with redirectors used, we want to link to the system libc dynamically. Reviewers: dlj, hfinkel, jakehehrlich, phosek, stanshebs, theraven, alexshap Subscribers: mgorny, libc-commits Tags: #libc-project Differential Revision: https://reviews.llvm.org/D69020	2019-11-01 11:06:12 -07:00
Siva Chandra	4380647e79	Add few docs and implementation of strcpy and strcat. Summary: This patch illustrates some of the features like modularity we want in the new libc. Few other ideas like different kinds of testing, redirectors etc are not yet present. Reviewers: dlj, hfinkel, theraven, jfb, alexshap, jdoerfert Subscribers: mgorny, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67867 llvm-svn: 373764	2019-10-04 17:30:54 +00:00

50 Commits