This is a implementation of find remainder fmod function from standard libm.
The underline algorithm is developed by myself, but probably it was first
invented before.
Some features of the implementation:
1. The code is written on more-or-less modern C++.
2. One general implementation for both float and double precision numbers.
3. Spitted platform/architecture dependent and independent code and tests.
4. Tests covers 100% of the code for both float and double numbers. Tests cases with NaN/Inf etc is copied from glibc.
5. The new implementation in general 2-4 times faster for “regular” x,y values. It can be 20 times faster for x/y huge value, but can also be 2 times slower for double denormalized range (according to perf tests provided).
6. Two different implementation of division loop are provided. In some platforms division can be very time consuming operation. Depend on platform it can be 3-10 times slower than multiplication.
Performance tests:
The test is based on core-math project (https://gitlab.inria.fr/core-math/core-math). By Tue Ly suggestion I took hypot function and use it as template for fmod. Preserving all test cases.
`./check.sh <--special|--worst> fmodf` passed.
`CORE_MATH_PERF_MODE=rdtsc ./perf.sh fmodf` results are
```
GNU libc version: 2.35
GNU libc release: stable
21.166 <-- FPU
51.031 <-- current glibc
37.659 <-- this fmod version.
```
This is mostly a mechanical change. In a future pass, all tests from
pthread which create threads will also be converted to integration tests.
Some of thread related features are tightly coupled with the loader. So,
they can only be tested with the in-house loader. Hence, going forward, all
tests which create threads will have to be integration tests.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D128381
This is a reland of D126773 / b2a9ea4420.
The removal of `-mllvm -combiner-global-alias-analysis` has landed separately
in D128051 / 7b73f53790.
And the removal of `-mllvm --tail-merge-threshold=0` is scheduled for
removal in a subsequent patch.
This patch is a subpart of D125768 intented to make the review easier.
The `SizedOp` struct represents operations to be performed on a certain number of bytes.
It is responsible for breaking them down into platform types and forwarded to the `Backend`.
The `Backend` struct represents a lower level abstraction that works only on types (`uint8_t`, `__m128i`, ...).
It is similar to instruction selection.
Differential Revision: https://reviews.llvm.org/D126768
The TLS loader test has been enabled for aarch64.
Handling of PT_TLS' filesz and memsz for x86_64 has also been fixed.
Reviewed By: jeffbailey
Differential Revision: https://reviews.llvm.org/D128032
The pointer converter handles the %p conversion. It uses the hex
converter for most of the conversion.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D127995
In the sign writer, a size_t was being compared to an int. This patch
casts the size_t to an int so that the comparison doesn't cause a sign
comparison warning.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D127984
This patch removes usage of `-mllvm -combiner-global-alias-analysis`
and relies on compiler builtin to implement `memcpy`.
Note that `-mllvm -combiner-global-alias-analysis` is actually only useful for
functions where buffers can alias (namely `memcpy` and `memmove`). The other
memory functions where not benefiting from the flag anyways.
The upside is that the memory functions can now be compiled from source with
thinlto (thinlto would not be able to carry on the flag when doing inlining).
The downside is that for compilers other than clang (i.e. not providing
`__builtin_memcpy_inline`) the codegen may be worse.
Differential Revision: https://reviews.llvm.org/D128051
Previously, any line buffered write of size 0 would cause an error.
The variable used to track the index of the last newline started at
the size of the write - 1, which underflowed. Now it's handled properly,
and a test has been added to prevent regressions.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D127914
This patch adds the entrypoint for printf. With this, building a
"hello world" program with just LLVM-libc is possible.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D126831
Add return values to converter functions to allow for better error
handling when writing files. Also move the file writing code around to
be easier to read.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D127773
Tests for pthread_detach and thrd_detach have not been added. Instead, a
test for the underlying implementation has been added as it makes use of
an internal wait method to synchronize with detached threads.
Reviewed By: lntue, michaelrj
Differential Revision: https://reviews.llvm.org/D127479
Implement double precision FMA (Fused Multiply-Add) for targets without
FMA instructions using __uint128_t to store the intermediate results.
Reviewed By: michaelrj, sivachandra
Differential Revision: https://reviews.llvm.org/D124495
A previous patch added the constant EXP_MANT_MASK to the FloatProperties
for other types of long double. This patch adds it to the special 80-bit
x87 long double.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D127550
Previously all FILE objects were fully buffered, this patch adds line
buffering and unbuffered output, as well as applying them to stdout and
stderr.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D126829
A small refactoring of builtin functions in preparation to adding fmod/fmodf function.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D127088