Pinboard (jm)
https://pinboard.in/u:jm/public/
recent bookmarks from jmWhen Bloom filters don't bloom2020-03-03T14:46:36+00:00
https://blog.cloudflare.com/when-bloom-filters-dont-bloom/
jmModern CPUs are really good at sequential memory access when it's possible to predict memory fetch patterns (see Cache prefetching). Random memory access on the other hand is very costly.
Advanced data structures are very interesting, but beware. Modern computers require cache-optimized algorithms. When working with large datasets, not fitting L3, prefer optimizing for reduced number loads, over optimizing the amount of memory used.
I guess it's fair to say that Bloom filters are great, as long as they fit into the L3 cache. The moment this assumption is broken, they are terrible. This is not news, Bloom filters optimize for memory usage, not for memory access. For example, see the Cuckoo Filters paper.
]]>cloudflare bloom-filters performance data-structures cpu cache l3 hashing perf perftoolshttps://pinboard.in/https://pinboard.in/u:jm/b:0d0316cd680e/