This is a roll forward of D101895 with two additional fixes:
Original Patch description:
> This is a follow up on D101524 which:
>
> - simplifies cpu features detection and usage,
> - flattens target dependent optimizations so it's obvious which implementations are generated,
> - provides an implementation targeting the host (march/mtune=native) for the mem* functions,
> - makes sure all implementations are unittested (provided the host can run them).
Additional fixes:
- Fix uninitialized ALL_CPU_FEATURES
- Use non pseudo microarch as it is only supported from Clang 12 on
Differential Revision: https://reviews.llvm.org/D102233
This reverts commit 541f107871 as the bots
are failing with unknown architecture "x86-64-v*". Will let the original
author decide on the right course of action to correct the problem and
reland.
This is a follow up on D101524 which:
- simplifies cpu features detection and usage,
- flattens target dependent optimizations so it's obvious which implementations are generated,
- provides an implementation targeting the host (march/mtune=native) for the mem* functions,
- makes sure all implementations are unittested (provided the host can run them),
- makes sure all implementations are benchmarkable (provided the host can run them).
Differential Revision: https://reviews.llvm.org/D101895
Current implementation defines LIBC_TARGET_MACHINE with the use of CMAKE_SYSTEM_PROCESSOR.
Unfortunately CMAKE_SYSTEM_PROCESSOR is OS dependent and can produce different results.
An evidence of this is the various matchers used to detect whether the architecture is x86.
This patch normalizes LIBC_TARGET_MACHINE and renames it LIBC_TARGET_ARCHITECTURE.
I've added many architectures but we may want to limit ourselves to x86 and ARM.
Differential Revision: https://reviews.llvm.org/D101524
We won't be able to run the compiled program since it will be compiled
for different system. We instead allow passing the CPU features via
CMake option in that case.
Differential Revision: https://reviews.llvm.org/D95203
The feature check should probably be enhanced for non-x86 architectures,
but this change shields them from x86 specific pieces until then.
This patch has been split out from https://reviews.llvm.org/D81533.
Summary:
The patch is not ready yet and is here to discuss a few options:
- How do we customize the implementation? (i.e. how to define `kRepMovsBSize`),
- How do we specify custom compilation flags? (We'd need `-fno-builtin-memcpy` to be passed in),
- How do we build? We may want to test in debug but build the libc with `-march=native` for instance,
- Clang has a brand new builtin `__builtin_memcpy_inline` which makes the implementation easy and efficient, but:
- If we compile with `gcc` or `msvc` we can't use it, resorting on less efficient code generation,
- With gcc we can use `__builtin_memcpy` but then we'd need a postprocess step to check that the final assembly do not contain call to `memcpy` (unlikely but allowed),
- For msvc we'd need to resort on the compiler optimization passes.
Reviewers: sivachandra, abrachet
Subscribers: mgorny, MaskRay, tschuett, libc-commits, courbet
Tags: #libc-project
Differential Revision: https://reviews.llvm.org/D74397