mindspore/third_party/robin_hood/README.md

4.9 KiB

➵ robin_hood unordered map & set Release GitHub license

Travis CI Build Status Appveyor Build Status Codacy Badge Total alerts Language grade: C/C++ Coverage Status

robin_hood::unordered_map and robin_hood::unordered_set is a platform independent replacement for std::unordered_map / std::unordered_set which is both faster and more memory efficient for real-world use cases.

Installation & Usage

Direct Inclusion

  1. Add robin_hood.h to your C++ project.
  2. Use robin_hood::unordered_map instead of std::unordered_map
  3. Use robin_hood::unordered_set instead of std::unordered_set

Conan, the C/C++ Package Manager

  1. Setup your CMakeLists.txt (see Conan documentation on how to use MSBuild, Meson and others) like this:
    project(myproject CXX)
    
    add_executable(${PROJECT_NAME} main.cpp)
    
    include(${CMAKE_BINARY_DIR}/conanbuildinfo.cmake) # Include Conan-generated file
    conan_basic_setup(TARGETS) # Introduce Conan-generated targets
    
    target_link_libraries(${PROJECT_NAME} CONAN_PKG::robin-hood-hashing)
    
  2. Create conanfile.txt in your source dir (don't forget to update the version)
    [requires]
    robin-hood-hashing/3.10.0
    
    [generators]
    cmake
    
  3. Install and run Conan, then build your project as always:
    pip install conan
    mkdir build
    cd build
    conan install ../ --build=missing
    cmake ../
    cmake --build .
    
    The robin-hood-hashing package in Conan is kept up to date by Conan contributors. If the version is out of date, please create an issue or pull request on the conan-center-index repository.

Benchmarks

Please see extensive benchmarks at Hashmaps Benchmarks. In short: robin_hood is always among the fastest maps and uses far less memory than std::unordered_map.

Design Choices

  • Two memory layouts. Data is either stored in a flat array, or with node indirection. Access for unordered_flat_map is extremely fast due to no indirection, but references to elements are not stable. It also causes allocation spikes when the map resizes, and will need plenty of memory for large objects. Node based map has stable references & pointers (NOT iterators! Similar to std::unordered_map) and uses const Key in the pair. It is a bit slower due to indirection. The choice is yours; you can either use robin_hood::unordered_flat_map or robin_hood::unordered_node_map directly. If you use robin_hood::unordered_map It tries to choose the layout that seems appropriate for your data.

  • Custom allocator. Node based representation has a custom bulk allocator that tries to make few memory allocations. All allocated memory is reused, so there won't be any allocation spikes. It's very fast as well.

  • Optimized hash. robin_hood::hash has custom implementations for integer types and for std::string that are very fast and falls back to std::hash for everything else.

  • Depends on good Hashing. For a really bad hash the performance will not only degrade like in std::unordered_map, the map will simply fail with an std::overflow_error. In practice, when using the standard robin_hood::hash, I have never seen this happening.

License

Licensed under the MIT License. See the LICENSE file for details.

by martinus