Pinboard (jm)
https://pinboard.in/u:jm/public/
recent bookmarks from jm

Cyan4973/xxHash: Extremely fast non-cryptographic hash algorithm
2021-02-01T11:48:46+00:00
https://github.com/Cyan4973/xxHash/
jm
hashing hash xxhash performance coding speed algorithms
https://pinboard.in/u:jm/b:57cfebc0c1ce/

Fibonacci Hashing: The Optimization that the World Forgot (or: a Better Alternative to Integer Modulo)
2018-06-18T10:23:24+00:00
https://probablydance.com/2018/06/16/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo/
jm
Turns out I was wrong. This is a big one. And everyone should be using it. Hash tables should not be prime number sized and they should not use an integer modulo to map hashes into slots. Fibonacci hashing is just better. Yet somehow nobody is using it and lots of big hash tables (including all the big implementations of std::unordered_map) are much slower than they should be because they don’t use Fibonacci Hashing.
Apparently this is binary multiplicative hashing, and Google's brotli, webp, and Snappy libs all use a constant derived heuristically from a compression test corpus along the same lines (see comments).
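A minimal sketch of the slot mapping in Python, assuming 64-bit hash values and a power-of-two table size. The names GOLDEN_64 and fibonacci_slot are illustrative, not taken from the article. The constant is 2^64 divided by the golden ratio: multiply the hash by it, keep the low 64 bits, and use the top bits of the product as the slot index in place of an integer modulo.

```python
# 2^64 / golden ratio (0x9E3779B97F4A7C15), the usual multiplier
# for 64-bit Fibonacci hashing.
GOLDEN_64 = 11400714819323198485
MASK_64 = (1 << 64) - 1


def fibonacci_slot(hash_value: int, table_bits: int) -> int:
    """Map a 64-bit hash to a slot in a table of 2**table_bits entries.

    Illustrative helper: multiplies by the golden-ratio constant,
    truncates to 64 bits (as unsigned overflow would in C), and keeps
    the top table_bits bits of the product as the slot index.
    """
    if not 1 <= table_bits <= 64:
        raise ValueError("table_bits must be between 1 and 64")
    product = (hash_value * GOLDEN_64) & MASK_64
    return product >> (64 - table_bits)


if __name__ == "__main__":
    # Sequential keys are spread across an 8-slot table (table_bits=3).
    print([fibonacci_slot(k, 3) for k in range(8)])
```

Because the index comes from the high bits of the product, every bit of the input hash can influence the slot, which is not the case when a power-of-two table simply masks off the low bits.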
(Via Michael Fogleman)
algorithms hashing hash fibonacci golden-ratio coding hacks brotli webp snappy hash-tables hashmaps load-distribution
https://pinboard.in/u:jm/b:9fbbdd34c27e/

google/highwayhash: Fast strong hash functions: SipHash/HighwayHash
2018-01-12T13:43:51+00:00
https://github.com/google/highwayhash
jm
[...] 64 bits and therefore infeasible to reverse. Permuting equalizes the distribution of the resulting bytes. The internal state occupies four 256-bit AVX2 registers. Due to limitations of the instruction set, the registers are partitioned into two 512-bit halves that remain independent until the reduce phase. The algorithm outputs 64 bit digests or up to 256 bits at no extra cost. In addition to high throughput, the algorithm is designed for low finalization cost. The result is more than twice as fast as SipTreeHash.
We also provide an SSE4.1 version (80% as fast for large inputs and 95% as fast for short inputs), an implementation for VSX on POWER and a portable version (10% as fast). A third-party ARM implementation is referenced below.
Statistical analyses and preliminary cryptanalysis are given in https://arxiv.org/abs/1612.06257.'
(via Tony Finch)
siphash highwayhash via:fanf hashing hashes algorithms mac google hash
https://pinboard.in/u:jm/b:c96748eca1a7/

BLAKE2: simpler, smaller, fast as MD5
2016-04-07T22:51:11+00:00
https://blake2.net/blake2.pdf
jm
crypto hash blake2 hashing blake algorithms sha1 sha3 simd performance mac
https://pinboard.in/u:jm/b:33cb0a51f577/

Trend Micro Locality Sensitive Hash
2015-05-18T12:59:31+00:00
https://github.com/trendmicro/tlsh
jm
A fuzzy matching library. Given a byte stream with a minimum length
of 512 bytes, TLSH generates a hash value which can be used for similarity
comparisons. Similar objects will have similar hash values which allows for
the detection of similar objects by comparing their hash values. Note that
the byte stream should have a sufficient amount of complexity. For example,
a byte stream of identical bytes will not generate a hash value.
Paper here: https://drive.google.com/file/d/0B6FS3SVQ1i0GTXk5eDl3Y29QWlk/edit
via adulau
nilsimsa sdhash ssdeep locality-sensitive hashing algorithm hashes trend-micro tlsh hash fuzzy-matching via:adulau
https://pinboard.in/u:jm/b:35798e024e53/

29c3 HashDOS presentation slides (PDF)
2013-01-01T15:24:05+00:00
https://131002.net/siphash/siphashdos_29c3_slides.pdf
jm
via:fanf cityhash siphash hash dos security hashdos murmurhash
https://pinboard.in/u:jm/b:33f909936eac/

Avoiding Hash Lookups in a Ruby Implementation
2012-09-05T09:13:05+00:00
http://blog.headius.com/2012/09/avoiding-hash-lookups-in-ruby.html
jm
via:declanmcgrath hash optimization ruby performance jruby hashing data-structures big-o optimisation
https://pinboard.in/u:jm/b:f9de450427ec/

Sort vs. Hash Revisited: Fast Join Implementation on Modern Multi-Core CPUs [PDF]
2010-06-29T10:25:07+00:00
http://www.vldb.org/pvldb/2/vldb09-257.pdf
jm
sort merge hash join databases performance cpu simd multicore
https://pinboard.in/u:jm/b:e3ed61671f24/

Why WeakHashMap Sucks
2009-09-01T17:06:24+00:00
http://blogs.azulsystems.com/cliff/2007/08/why-weakhashmap.html
jm
softreferences weakreferences weak references gc java jvm caching hash memory collections vm weakhashmap via:spyced
https://pinboard.in/u:jm/b:30d2ba377aa4/