Non-empty Bins With Simple Tabulation Hashing
Aamand Anders, Thorup Mikkel. Arxiv 2018
[Paper]
ARXIV
Independent
We consider the hashing of a set with using a simple
tabulation hash function and analyse the number of
non-empty bins, that is, the size of . We show that the expected size of
matches that with fully random hashing to within low-order terms. We
also provide concentration bounds. The number of non-empty bins is a
fundamental measure in the balls and bins paradigm, and it is critical in
applications such as Bloom filters and Filter hashing. For example, normally
Bloom filters are proportioned for a desired low false-positive probability
assuming fully random hashing (see \url{en.wikipedia.org/wiki/Bloom_filter}).
Our results imply that if we implement the hashing with simple tabulation, we
obtain the same low false-positive probability for any possible input.
Similar Work