He Wei
41dcac9c49
Robin-hood-hashing (https://github.com/martinus/robin-hood-hashing) is considered faster then std::unordered_map/set, so we use it to improve mindspore performance. 1. robin_hood head file in `third_party/robin_hood/include`; 2. In `utils/hash_map.h` and `utils/hash_set.h`, we define: - mindspore::HashMap as an alias of robin_hood::unordered_map; - mindspore::HashSet as an alias of robin_hood::unordered_set; 3. Replace: - `#include <unordered_map>` --> `#include "utils/hash_map.h"`; - `#include <unordered_set>` --> `#include "utils/hash_set.h"`; - `std::unordered_map` --> `mindspore::HashMap`; - `std::unordered_set` --> `mindspore::HashSet`; - `map.insert(std::pair(key, value))` --> `map.emplace(key, value)`; - `[] (const std::pair<K, V> &p) {..} ` --> `[] (const auto &p) {..} `; 4. Fix issues found by switch to robin_hood: - AnfNodeConfig hash and equal; - Fix a bug in `Slice::operator==()`; - Fix a bug in `CNode::HasPrimalAttr()`; - Fix map.erase() usage bugs: `map.erase(iter++)` --> `iter = map.erase(iter)`; - Fix some iterator invalidated problem; 5. Some std::unordered_map/set can not replace by robin_hood: - As parameter of functions that exposed to python by pybind11; - Use bad hash that cause robin_hood::map over_flow, such as AbstractBasePtrListHasher; 6. Update cpp unit tests; 7. Add build option '-F' to enable robin_hood, default on. |
||
---|---|---|
.. | ||
include/robin_hood | ||
LICENSE | ||
README.md |
README.md
➵ robin_hood unordered map & set
robin_hood::unordered_map
and robin_hood::unordered_set
is a platform independent replacement for std::unordered_map
/ std::unordered_set
which is both faster and more memory efficient for real-world use cases.
Installation & Usage
Direct Inclusion
- Add
robin_hood.h
to your C++ project. - Use
robin_hood::unordered_map
instead ofstd::unordered_map
- Use
robin_hood::unordered_set
instead ofstd::unordered_set
Conan, the C/C++ Package Manager
- Setup your
CMakeLists.txt
(see Conan documentation on how to use MSBuild, Meson and others) like this:project(myproject CXX) add_executable(${PROJECT_NAME} main.cpp) include(${CMAKE_BINARY_DIR}/conanbuildinfo.cmake) # Include Conan-generated file conan_basic_setup(TARGETS) # Introduce Conan-generated targets target_link_libraries(${PROJECT_NAME} CONAN_PKG::robin-hood-hashing)
- Create
conanfile.txt
in your source dir (don't forget to update the version)[requires] robin-hood-hashing/3.10.0 [generators] cmake
- Install and run Conan, then build your project as always:
Thepip install conan mkdir build cd build conan install ../ --build=missing cmake ../ cmake --build .
robin-hood-hashing
package in Conan is kept up to date by Conan contributors. If the version is out of date, please create an issue or pull request on theconan-center-index
repository.
Benchmarks
Please see extensive benchmarks at Hashmaps Benchmarks. In short: robin_hood
is always among the fastest maps and uses far less memory than std::unordered_map
.
Design Choices
-
Two memory layouts. Data is either stored in a flat array, or with node indirection. Access for
unordered_flat_map
is extremely fast due to no indirection, but references to elements are not stable. It also causes allocation spikes when the map resizes, and will need plenty of memory for large objects. Node based map has stable references & pointers (NOT iterators! Similar to std::unordered_map) and usesconst Key
in the pair. It is a bit slower due to indirection. The choice is yours; you can either userobin_hood::unordered_flat_map
orrobin_hood::unordered_node_map
directly. If you userobin_hood::unordered_map
It tries to choose the layout that seems appropriate for your data. -
Custom allocator. Node based representation has a custom bulk allocator that tries to make few memory allocations. All allocated memory is reused, so there won't be any allocation spikes. It's very fast as well.
-
Optimized hash.
robin_hood::hash
has custom implementations for integer types and forstd::string
that are very fast and falls back tostd::hash
for everything else. -
Depends on good Hashing. For a really bad hash the performance will not only degrade like in
std::unordered_map
, the map will simply fail with anstd::overflow_error
. In practice, when using the standardrobin_hood::hash
, I have never seen this happening.
License
Licensed under the MIT License. See the LICENSE file for details.
by martinus