Quicksort Largest Bucket And Min-wise Hashing With Limited Independence
Knudsen Mathias Bæk Tejs, Stöckel Morten. Arxiv 2015
[Paper]
ARXIV
Independent
Randomized algorithms and data structures are often analyzed under the
assumption of access to a perfect source of randomness. The most fundamental
metric used to measure how “random” a hash function or a random number
generator is, is its independence: a sequence of random variables is said to be
-independent if every variable is uniform and every size subset is
independent. In this paper we consider three classic algorithms under limited
independence. We provide new bounds for randomized quicksort, min-wise hashing
and largest bucket size under limited independence. Our results can be
summarized as follows.
-Randomized quicksort. When pivot elements are computed using a
-independent hash function, Karloff and Raghavan, J.ACM’93 showed expected worst-case running time for a special version of quicksort.
We improve upon this, showing that the same running time is achieved with only
-independence.
-Min-wise hashing. For a set , consider the probability of a particular
element being mapped to the smallest hash value. It is known that
-independence implies the optimal probability . Broder et al.,
STOC’98 showed that -independence implies it is . We show
a matching lower bound as well as new tight bounds for - and -independent
hash functions.
-Largest bucket. We consider the case where balls are distributed to
buckets using a -independent hash function and analyze the largest bucket
size. Alon et. al, STOC’97 showed that there exists a -independent hash
function implying a bucket of size . We generalize the
bound, providing a -independent family of functions that imply size .
Similar Work