The optimizer is petulant and temperamental. In this case LLVM failed to lower
the the "insert at end" loop used by`vector<unsigned char>` to a `memset` despite
`memset` being substantially faster over a range of bytes.
LLVM has the ability to lower loops to `memset` whet appropriate, but the
odd nature of libc++'s loops prevented the optimization from taking places.
This patch addresses the issue by rewriting the loops from the form
`do [ ... --__n; } while (__n > 0);` to instead use a for loop over a pointer
range (For example: `for (auto *__i = ...; __i < __e; ++__i)`).
This patch also rewrites the asan annotations to unposion all additional memory
at the start of the loop instead of once per iterations. This could potentially
permit false negatives where the constructor of element N attempts to access
element N + 1 during its construction.
The before and after results for the `BM_ConstructSize/vector_byte/5140480_mean`
benchmark (run 5 times) are:
--------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------------------------------------
Before
------
BM_ConstructSize/vector_byte/5140480_mean 12530140 ns 12469693 ns N/A
BM_ConstructSize/vector_byte/5140480_median 12512818 ns 12445571 ns N/A
BM_ConstructSize/vector_byte/5140480_stddev 106224 ns 107907 ns 5
-----
After
-----
BM_ConstructSize/vector_byte/5140480_mean 167285 ns 166500 ns N/A
BM_ConstructSize/vector_byte/5140480_median 166749 ns 166069 ns N/A
BM_ConstructSize/vector_byte/5140480_stddev 3242 ns 3184 ns 5
llvm-svn: 367183
Summary:
Instead of populating the global LIBCXX_LIBRARIES, we use the link-time
dependency management built into CMake to propagate link flags. This
leads to a cleaner and easier-to-follow build.
Reviewers: phosek, smeenai, EricWF
Subscribers: mgorny, christof, jkorous, dexonsmith, jfb, mstorsjo, libcxx-commits
Tags: #libc
Differential Revision: https://reviews.llvm.org/D60969
llvm-svn: 359571
Summary:
Comparing against the empty string should generate much better code that
what it does today.
We can also generate better code when comparing against literals that
are larger than the SSO space.
Reviewers: EricWF
Subscribers: christof, jdoerfert, libcxx-commits
Tags: #libc
Differential Revision: https://reviews.llvm.org/D59781
llvm-svn: 357614
Summary:
Add relational benchmark against a string constant.
These can potentially trigger inlining of the operations. We want to
benchmark that.
Reviewers: EricWF
Subscribers: christof, jdoerfert, libcxx-commits
Tags: #libc
Differential Revision: https://reviews.llvm.org/D59512
llvm-svn: 356680
Summary:
This patch treats <filesystem> as a first-class citizen of the dylib,
like all other sub-libraries (e.g. <chrono>). As such, it also removes
all special handling for installing the filesystem library separately
or disabling part of the test suite from the lit command line.
Unlike the previous attempt (r356500), this doesn't remove all the
filesystem tests.
Reviewers: mclow.lists, EricWF, serge-sans-paille
Subscribers: mgorny, christof, jkorous, dexonsmith, jfb, jdoerfert, libcxx-commits
Differential Revision: https://reviews.llvm.org/D59152
llvm-svn: 356518
When I applied r356500 (https://reviews.llvm.org/D59152), I somehow
deleted all of filesystem's tests. I will revert r356500 and re-apply
it properly.
llvm-svn: 356505
Summary:
This patch treats <filesystem> as a first-class citizen of the dylib,
like all other sub-libraries (e.g. <chrono>). As such, it also removes
all special handling for installing the filesystem library separately
or disabling part of the test suite from the lit command line.
Reviewers: mclow.lists, EricWF, serge-sans-paille
Subscribers: mgorny, christof, jkorous, dexonsmith, jfb, jdoerfert, libcxx-commits
Differential Revision: https://reviews.llvm.org/D59152
llvm-svn: 356500
When linking library dependencies, we shouldn't need to export linked
libraries to dependents. We should be explicit about this in
target_link_libraries, otherwise other targets that depend on these such
as sanitizers get repeated (and possibly even conflicting) dependencies.
Differential Revision: https://reviews.llvm.org/D57456
llvm-svn: 352688
to reflect the new license. These used slightly different spellings that
defeated my regular expressions.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351648
This is a re-application of r345525, which had been reverted by fear of
a regression.
Reviewed as https://reviews.llvm.org/D53994.
Thanks to Denis Yaroshevskiy for the patch.
llvm-svn: 349358
Summary:
Benchmarks for std::sort, std::stable_sort, std::make_heap,
std::sort_heap, std::pop_heap and std::push_heap.
The benchmarks are run with integers and strings, and with different
sorted input.
Reviewers: EricWF
Subscribers: christof, mgrang, ldionne, libcxx-commits
Differential Revision: https://reviews.llvm.org/D53978
llvm-svn: 347329
This patch renames the cxx-benchmark-unittests to check-cxx-benchmarks
and converts the target to use LIT in order to make the tests run faster
and provide better output.
In particular this runs each benchmark in a suite one by one, allowing
more parallelism while ensuring output isn't garbage with multiple threads.
Additionally, it adds the CMake flag '-DLIBCXX_BENCHMARK_TEST_ARGS=<list>'
to specify what options are passed when running the benchmarks.
llvm-svn: 346888
This patch adds the cxx-benchmark-unittests target so we can start
getting test coverage on the benchmarks, including building with
sanitizers. Because we're only looking for test-coverage, the benchmarks
run for the shortest time possible, and in parallel.
The target is excluded from all by default. It only
builds and runs the libcxx configurations of the benchmarks, and not
any versions built against the systems native standard library.
llvm-svn: 346811
The usage of aligned_storage failed to pass the alignment it wanted,
which caused it to have a larger size and alignment that the
std::string's it was intended to store.
This patch manually specifies the alignment, as well as cleaning up
type alias bugs.
llvm-svn: 346779
The benchmarks currently require C++17, however Clang 3.9 doesn't
support -std=c++17 while still supporting all the C++17 features needed
to compile the benchmarks.
This patch makes the benchmark build attempt to fall back to -std=c++1z
when -std=c++17 isn't supported.
See llvm.org/PR39629
llvm-svn: 346744
This reverts r345525. I'm reverting because that patch apparently caused
a regression on certain platforms (see https://reviews.llvm.org/D53994).
Since we don't fully understand the reasons for the regression, I'm
reverting until we can provide a fix we understand.
llvm-svn: 345893
Summary:
Added benchmarks for Construct, Copy, Move, Destroy, Relationals and
Read. On the ones that matter, the benchmarks tests hot and cold data,
and opaque and transparent inputs.
Reviewers: EricWF
Subscribers: christof, ldionne, libcxx-commits
Differential Revision: https://reviews.llvm.org/D53825
llvm-svn: 345611
Summary:
Benchmarks for construct, find, insert and iterate, with sequential
and random ordered inputs.
It also improves the cartesian product benchmark header to allow for
runtime values to be specified in the product.
Reviewers: EricWF
Subscribers: christof, ldionne, libcxx-commits
Differential Revision: https://reviews.llvm.org/D53523
llvm-svn: 345035
Summary:
Benchmarks for construct, copy, move, swap, destroy and invoke, with 8
different input states.
For the cases that matter, it tests with and without allowing constant
value propagation from construction into the operation tested.
This also adds helper functions to generate the cartesian product of
different configurations and generate benchmarks for all of them.
Reviewers: EricWF
Subscribers: christof, ldionne, libcxx-commits
Differential Revision: https://reviews.llvm.org/D53087
llvm-svn: 344415
This is a fairly large patch that implements all of the filesystem NB comments
and the relative paths changes (ex. adding weakly_canonical). These issues
and papers are all interrelated so their implementation couldn't be split up
nicely.
This patch upgrades <experimental/filesystem> to match the C++17 spec and not
the published experimental TS spec. Some of the changes in this patch are both
API and ABI breaking, however libc++ makes no guarantee about stability for
experimental implementations.
The major changes in this patch are:
* Implement NB comments for filesystem (P0492R2), including:
* Implement `perm_options` enum as part of NB comments, and update the
`permissions` function to match.
* Implement changes to `remove_filename` and `replace_filename`
* Implement changes to `path::stem()` and `path::extension()` which support
splitting examples like `.profile`.
* Change path iteration to return an empty path instead of '.' for trailing
separators.
* Change `operator/=` to handle absolute paths on the RHS.
* Change `absolute` to no longer accept a current path argument.
* Implement relative paths according to NB comments (P0219r1)
* Combine `path.cpp` and `operations.cpp` since some path functions require
access to the operations internals, and some fs operations require access
to the path parser.
llvm-svn: 329028
benchmarks/util_smartptr.bench.cpp
Change CRLF to LF.
test/std/localization/locale.categories/category.monetary/locale.money.get/locale.money.get.members/get_long_double_fr_FR.pass.cpp
Consistently comment "\u20ac" as EURO SIGN, its Unicode name, instead of the actual Unicode character.
test/std/utilities/allocator.adaptor/allocator.adaptor.members/construct_type.pass.cpp
Avoid non-ASCII dash.
Fixes D40991.
llvm-svn: 320536
The function num_get<_CharT>::stage2_int_prep makes unnecessary copy of src
into atoms when char_type is char. This can be avoided by creating
a switch on type and just returning __src when char_type is char.
Added the test case to demonstrate performance improvement.
In order to avoid ABI incompatibilities, the changes are guarded
with a macro _LIBCPP_ABI_OPTIMIZED_LOCALE_NUM_GET
Differential Revision: https://reviews.llvm.org/D30268
Reviewed by: EricWF
llvm-svn: 305427
path uses string::append to construct, append, and concatenate paths. Unfortunatly
string::append has a strong exception safety guaranteed and if it can't prove
that the iterator operations don't throw then it will allocate a temporary
string copy to append to. However this extra allocation and copy is very
undesirable for path which doesn't have the same exception guarantees.
To work around this this patch adds string::__append_forward_unsafe which exposes
the std::string::append interface for forward iterators without enforcing
that the iterator is noexcept.
llvm-svn: 285532
This patch fixes a performance bug when constructing or appending to a path
from a string or c-string. Previously we called 'push_back' to append every
single character. This caused multiple re-allocation and copies when at most
one reallocation is necessary. The new behavior is to simply call
`string::append` so it can correctly handle reallocation.
For large strings this change is a ~4x improvement. This also makes our path
faster to construct than libstdc++'s.
llvm-svn: 285530
This patch entirely rewrites the parsing logic for paths. Unlike the previous
implementation this one stores information about the current state; For example
if we are in a trailing separator or a root separator. This avoids the need for
extra lookahead (and extra work) when incrementing or decrementing an iterator.
Roughly this gives us a 15% speedup over the previous implementation.
Unfortunately this implementation is still a lot slower than libstdc++'s.
Because libstdc++ pre-parses and splits the path upon construction their
iterators are trivial to increment/decrement. This makes libc++ lazy parsing
100x slower than libstdc++. However the pre-parsing libstdc++ causes a ton
of extra and unneeded allocations when constructing the string. For example
`path("/foo/bar/")` would require at least 5 allocations with libstdc++
whereas libc++ uses only one. The non-allocating behavior is much preferable
when you consider filesystem usages like 'exists("/foo/bar/")'.
Even then libc++'s path seems to be twice as slow to simply construct compared
to libstdc++. More investigation is needed about this.
llvm-svn: 285526
This patch enables the `cxx-benchmarks` target by default. Note that the target
still has to be manually invoked since it isn't included in the default 'make'
rule.
This patch also gets the benchmarks building w/ GCC. The build previously
required the '-stdlib=libc++' flag but upstream patches to Google Benchmark
now allow the library to build w/ libc++ and GCC.
These changes should make the benchmarks easier to build and test.
llvm-svn: 279999
I've put some work into the Google Benchmark library in order to make it easier
to benchmark libc++. These changes have already been upstreamed into
Google Benchmark and this patch applies the changes to the in-tree version.
The main improvement in the addition of a 'compare_bench.py' script which
makes it very easy to compare benchmarks. For example to compare the native
STL to libc++ you would run:
`$ compare_bench.py ./util_smartptr.native.out ./util_smartptr.libcxx.out`
And the output would look like:
RUNNING: ./util_smartptr.native.out
Benchmark Time CPU Iterations
----------------------------------------------------------------
BM_SharedPtrCreateDestroy 62 ns 62 ns 10937500
BM_SharedPtrIncDecRef 31 ns 31 ns 23972603
BM_WeakPtrIncDecRef 28 ns 28 ns 23648649
RUNNING: ./util_smartptr.libcxx.out
Benchmark Time CPU Iterations
----------------------------------------------------------------
BM_SharedPtrCreateDestroy 46 ns 46 ns 14957265
BM_SharedPtrIncDecRef 31 ns 31 ns 22435897
BM_WeakPtrIncDecRef 34 ns 34 ns 21084337
Comparing ./util_smartptr.native.out to ./util_smartptr.libcxx.out
Benchmark Time CPU
-----------------------------------------------------
BM_SharedPtrCreateDestroy -0.26 -0.26
BM_SharedPtrIncDecRef +0.00 +0.00
BM_WeakPtrIncDecRef +0.21 +0.21
llvm-svn: 278147
Summary:
This patch does the following:
1. Checks in a copy of the Google Benchmark library into the libc++ repo under `utils/google-benchmark`.
2. Teaches libc++ how to build Google Benchmark against both (A) in-tree libc++ and (B) the platforms native STL.
3. Allows performance benchmarks to be built as part of the libc++ build.
Building the benchmarks (and Google Benchmark) is off by default. It must be enabled using the CMake option `-DLIBCXX_INCLUDE_BENCHMARKS=ON`. When this option is enabled the tests under `libcxx/benchmarks` can be built using the `libcxx-benchmarks` target.
On Linux platforms where libstdc++ is the default STL the CMake option `-DLIBCXX_BUILD_BENCHMARKS_NATIVE_STDLIB=ON` can be used to build each benchmark test against libstdc++ as well. This is useful for comparing performance between standard libraries.
Support for benchmarks is currently very minimal. They must be manually run by the user and there is no mechanism for detecting performance regressions.
Known Issues:
* `-DLIBCXX_INCLUDE_BENCHMARKS=ON` is only supported for Clang, and not GCC, since the `-stdlib=libc++` option is needed to build Google Benchmark.
Reviewers: danalbert, dberlin, chandlerc, mclow.lists, jroelofs
Subscribers: chandlerc, dberlin, tberghammer, danalbert, srhines, hfinkel
Differential Revision: https://reviews.llvm.org/D22240
llvm-svn: 276049
This patch improves the performance of unordered_set's find by 45% when
the value exists within the set. __hash_tables find method
needs to check if it's reached the end of the bucket by constraining the
hash of the current node and checking it against the bucket index. However
constraining the hash is an expensive operations and it can be avoided if the
two unconstrained hashes are equal. This patch applies that optimization.
This patch also adds a top level directory called benchmarks. 'benchmarks/'
is intended to store any/all benchmarks written for the standard library.
Currently nothing is done with files under 'benchmarks/' but I would like
to move towards introducing a formal format and test runner.
llvm-svn: 274423