[libc] Adding memcpy implementation for x86_64
Summary:
The patch is not ready yet and is here to discuss a few options:
- How do we customize the implementation? (i.e. how to define `kRepMovsBSize`),
- How do we specify custom compilation flags? (We'd need `-fno-builtin-memcpy` to be passed in),
- How do we build? We may want to test in debug but build the libc with `-march=native` for instance,
- Clang has a brand new builtin `__builtin_memcpy_inline` which makes the implementation easy and efficient, but:
  - If we compile with `gcc` or `msvc` we can't use it, resorting to less efficient code generation,
  - With gcc we can use `__builtin_memcpy` but then we'd need a postprocess step to check that the final assembly does not contain a call to `memcpy` (unlikely but allowed),
  - For msvc we'd need to rely on the compiler optimization passes.
Reviewers: sivachandra, abrachet
Subscribers: mgorny, MaskRay, tschuett, libc-commits, courbet
Tags: #libc-project
Differential Revision: https://reviews.llvm.org/D74397
2020-02-11 20:37:02 +08:00
2020-02-20 22:00:45 +08:00

# ------------------------------------------------------------------------------
# Cpu features definition and flags
# ------------------------------------------------------------------------------

if(${LIBC_TARGET_MACHINE} MATCHES "x86|x86_64")
  set(ALL_CPU_FEATURES SSE SSE2 AVX AVX2 AVX512F)
endif()

list(SORT ALL_CPU_FEATURES)

# Function to check whether the host supports the provided set of features.
# Usage:
# host_supports(
#   <output variable>
#   <list of cpu features>
# )
# The comparison is order-sensitive: pass the features as a single quoted list,
# in the same order as they appear in HOST_CPU_FEATURES.
function(host_supports output_var features)
  _intersection(a "${HOST_CPU_FEATURES}" "${features}")
  if("${a}" STREQUAL "${features}")
    set(${output_var} TRUE PARENT_SCOPE)
  else()
    unset(${output_var} PARENT_SCOPE)
  endif()
endfunction()
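
As a usage illustration, the snippet below sketches how `host_supports` might be called. The variable name `can_run_avx2` is hypothetical, and the snippet assumes this module has already been included so that `HOST_CPU_FEATURES` is populated.

```cmake
# Sketch only: assumes this module has been include()d, so that
# host_supports() and HOST_CPU_FEATURES are available.
# `can_run_avx2` is a hypothetical variable name chosen for this example.
# The feature list is passed as ONE quoted, semicolon-separated argument.
host_supports(can_run_avx2 "AVX;AVX2")

if(can_run_avx2)
  message(STATUS "Host supports AVX and AVX2")
else()
  # host_supports() unsets the variable on failure, so it evaluates to false.
  message(STATUS "Host lacks AVX and/or AVX2")
endif()
```

Because the function compares the intersection with the requested list as strings, the requested features need to appear in the same order as in `HOST_CPU_FEATURES` for the check to succeed.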

# Function to compute the flags to pass down to the compiler.
# Usage:
# compute_flags(
#   <output variable>
#   MARCH <arch name or "native">
#   REQUIRE <list of mandatory features to enable>
#   REJECT <list of features to disable>
# )
function(compute_flags output_var)
  cmake_parse_arguments(
    "COMPUTE_FLAGS"
    "" # Optional arguments
    "MARCH" # Single value arguments
    "REQUIRE;REJECT" # Multi value arguments
    ${ARGN})
  # Check that features are not required and rejected at the same time.
  if(COMPUTE_FLAGS_REQUIRE AND COMPUTE_FLAGS_REJECT)
    # Quote both lists so each one reaches _intersection as a single argument.
    _intersection(var "${COMPUTE_FLAGS_REQUIRE}" "${COMPUTE_FLAGS_REJECT}")
    if(var)
      message(FATAL_ERROR "Cpu Features REQUIRE and REJECT ${var}")
    endif()
  endif()
  # Generate the compiler flags in `current`.
  set(current "") # Start empty so a caller's `current` cannot leak in.
  if(${CMAKE_CXX_COMPILER_ID} MATCHES "Clang|GNU")
    if(COMPUTE_FLAGS_MARCH)
      list(APPEND current "-march=${COMPUTE_FLAGS_MARCH}")
    endif()
    foreach(feature IN LISTS COMPUTE_FLAGS_REQUIRE)
      string(TOLOWER ${feature} lowercase_feature)
      list(APPEND current "-m${lowercase_feature}")
    endforeach()
    foreach(feature IN LISTS COMPUTE_FLAGS_REJECT)
      string(TOLOWER ${feature} lowercase_feature)
      list(APPEND current "-mno-${lowercase_feature}")
    endforeach()
  else()
    # In future, we can extend for other compilers.
    message(FATAL_ERROR "Unknown compiler ${CMAKE_CXX_COMPILER_ID}.")
  endif()
  # Export the list of flags.
  set(${output_var} "${current}" PARENT_SCOPE)
endfunction()
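
As a usage illustration, the call below sketches how `compute_flags` might be invoked. The variable name `avx2_flags` is hypothetical, and the result shown assumes a Clang or GNU compiler, since any other compiler reaches the fatal error branch.

```cmake
# Sketch only: assumes this module has been include()d so compute_flags() exists.
# `avx2_flags` is a hypothetical variable name chosen for this example.
compute_flags(avx2_flags
  MARCH native
  REQUIRE AVX2
  REJECT AVX512F)

# With Clang or GCC, the list holds: -march=native;-mavx2;-mno-avx512f
message(STATUS "Flags: ${avx2_flags}")
```

Feature names are lowercased before being turned into `-m<feature>` or `-mno-<feature>`, so they can be written in the same upper-case form used in `ALL_CPU_FEATURES`.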

# ------------------------------------------------------------------------------
# Internal helpers and utilities.
# ------------------------------------------------------------------------------

# Computes the intersection between two lists.
# Each list must be passed as a single (quoted) argument.
function(_intersection output_var list1 list2)
  set(tmp "") # Start empty so a caller's `tmp` cannot leak in.
  foreach(element IN LISTS list1)
    if("${list2}" MATCHES "(^|;)${element}(;|$)")
      list(APPEND tmp "${element}")
    endif()
  endforeach()
  set(${output_var} ${tmp} PARENT_SCOPE)
endfunction()
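
A minimal sketch of how `_intersection` behaves, assuming the function is in scope. The variable names `host`, `wanted` and `common` are hypothetical and only serve the example.

```cmake
# Sketch only: assumes _intersection() from this module is in scope.
# `host`, `wanted` and `common` are hypothetical names for this example.
set(host AVX AVX2 SSE SSE2)
set(wanted AVX2 AVX512F)

# Quote each list so it arrives as one argument; unquoted, the elements would
# be spread over several arguments and only the first two would be compared.
_intersection(common "${host}" "${wanted}")

message(STATUS "Common features: ${common}")  # Common features: AVX2
```

The result keeps the order of the first list, because the function iterates over `list1` and tests each element for membership in `list2`.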

# Generates a cpp file to introspect the compiler defined flags.
function(_generate_check_code)
  foreach(feature IN LISTS ALL_CPU_FEATURES)
    set(DEFINITIONS
        "${DEFINITIONS}
#ifdef __${feature}__
    \"${feature}\",
#endif")
  endforeach()
  configure_file(
    "${LIBC_SOURCE_DIR}/cmake/modules/cpu_features/check_cpu_features.cpp.in"
    "cpu_features/check_cpu_features.cpp" @ONLY)
endfunction()
_generate_check_code()

# Compiles and runs the code generated above with the specified requirements.
# This is helpful to infer which features a particular target supports or if
# a specific feature implies other features (e.g. BMI2 implies SSE2 and SSE).
function(_check_defined_cpu_feature output_var)
  cmake_parse_arguments(
    "CHECK_DEFINED"
    "" # Optional arguments
    "MARCH" # Single value arguments
    "REQUIRE;REJECT" # Multi value arguments
    ${ARGN})
  compute_flags(
    flags
    MARCH ${CHECK_DEFINED_MARCH}
    REQUIRE ${CHECK_DEFINED_REQUIRE}
    REJECT ${CHECK_DEFINED_REJECT})
  try_run(
    run_result compile_result "${CMAKE_CURRENT_BINARY_DIR}/check_${feature}"
    "${CMAKE_CURRENT_BINARY_DIR}/cpu_features/check_cpu_features.cpp"
    COMPILE_DEFINITIONS ${flags}
    COMPILE_OUTPUT_VARIABLE compile_output
    RUN_OUTPUT_VARIABLE run_output)
  if(${compile_result} AND ("${run_result}" EQUAL 0))
    set(${output_var}
        "${run_output}"
        PARENT_SCOPE)
  else()
    message(FATAL_ERROR "${compile_output}")
  endif()
endfunction()

# Populates the HOST_CPU_FEATURES list.
_check_defined_cpu_feature(HOST_CPU_FEATURES MARCH native)