llvm-project

Commit Graph

Author	SHA1	Message	Date
Chandler Carruth	0d7c2788e4	Remove the misguided extension here that reserved two special values in the hash_code. I'm not sure what I was thinking here; the use cases for special values are in the keys, not in the hashes of those keys. We can always resurrect this if needed, or clients can accomplish the same goal themselves. This makes the general case somewhat faster (~5 cycles faster on my machine) and smaller with less branching. llvm-svn: 151865	2012-03-02 00:48:38 +00:00
Chandler Carruth	396260c484	Re-disable the debug output. The comment is there explaining why we want to keep this around -- updating golden tests is annoying otherwise. Thanks to Benjamin for pointing this omission out on IRC. llvm-svn: 151860	2012-03-01 23:20:45 +00:00
Chandler Carruth	3da579832f	Provide the 32-bit variant of the golden tests. Not sure how I forgot to do this initially, sorry. llvm-svn: 151857	2012-03-01 23:06:19 +00:00
Chandler Carruth	1d03a3b6b1	Rewrite LLVM's generalized support library for hashing to follow the API of the proposed standard hashing interfaces (N3333), and to use a modified and tuned version of the CityHash algorithm. Some of the highlights of this change: -- Significantly higher quality hashing algorithm with very well distributed results, and extremely few collisions. Should be close to a checksum for up to 64-bit keys. Very little clustering or clumping of hash codes, to better distribute load on probed hash tables. -- Built-in support for reserved values. -- Simplified API that composes cleanly with other C++ idioms and APIs. -- Better scaling performance as keys grow. This is the fastest algorithm I've found and measured for moderately sized keys (such as show up in some of the uniquing and folding use cases) -- Support for enabling per-execution seeds to prevent table ordering or other artifacts of hashing algorithms from impacting the output of LLVM. The seeding would make each run different and highlight these problems during bootstrap. This implementation was tested extensively using the SMHasher test suite, and passed with flying colors, doing better than the original CityHash algorithm even. I've included a unittest, although it is somewhat minimal at the moment. I've also added (or refactored into the proper location) type traits necessary to implement this, and converted users of GeneralHash over. My only immediate concern with this implementation is the performance of hashing small keys. I've already started working to improve this, and will continue to do so. Currently, the only faster algorithms produce lower quality results, but it is likely there is a better compromise than the current one. Many thanks to Jeffrey Yasskin who did most of the work on the N3333 paper, pair-programmed some of this code, and reviewed much of it. Many thanks also go to Geoff Pike and Jyrki Alakuijala, the original authors of CityHash on which this is heavily based, and Austin Appleby who created MurmurHash and the SMHasher test suite. Also thanks to Nadav, Tobias, Howard, Jay, Nick, Ahmed, and Duncan for all of the review comments! If there are further comments or concerns, please let me know and I'll jump on 'em. llvm-svn: 151822	2012-03-01 18:55:25 +00:00
Talin	f2291c908b	Hashing.h - utilities for hashing various data types. llvm-svn: 150890	2012-02-18 21:00:49 +00:00

5 Commits