Distinct Elements In Streams: An Algorithm For The (Text) Book

Chakraborty Sourav, Vinodchandran N. V., Meel Kuldeep S. Appeared in Proceedings of 2023

Given a data stream $A = \langle a_{1}, a_{2}, \dots, a_{m} \rangle$ of $m$ elements, where each $a_{i} \in [n]$, the Distinct Elements problem is to estimate the number of distinct elements in $A$. Distinct Elements has been the subject of theoretical and empirical investigation over the past four decades, resulting in space-optimal algorithms for it. All current state-of-the-art algorithms are, however, beyond the reach of an undergraduate textbook because they rely on notions such as pairwise independence and universal hash functions. We present a simple, intuitive, sampling-based, space-efficient algorithm whose description and proof are accessible to undergraduates with knowledge of basic probability theory.
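The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of a sampling-based estimator in the spirit described there. It keeps a bounded sample of the stream, halves the sampling probability $p$ whenever the sample fills up, and reports the sample size divided by $p$. The function name `estimate_distinct`, its parameters (`eps`, `delta`, `m`, `rng`), and the threshold formula are assumptions made for illustration and should be checked against the paper itself.

```python
import math
import random


def estimate_distinct(stream, eps, delta, m, rng=None):
    """Estimate the number of distinct items in `stream` by adaptive sampling.

    Assumed parameters (not taken from the abstract): `eps` is the target
    relative error, `delta` the allowed failure probability, `m` an upper
    bound on the stream length, and `rng` an optional random.Random.
    """
    rng = rng or random.Random()
    # Assumed sample-size threshold; it controls the space used.
    thresh = math.ceil(12 / eps**2 * math.log2(8 * m / delta))
    p = 1.0          # current sampling probability
    sample = set()   # sampled elements, never larger than thresh

    for a in stream:
        # Forget any earlier decision about `a`, then re-sample it with
        # probability p, so only its most recent occurrence matters.
        sample.discard(a)
        if rng.random() < p:
            sample.add(a)
        if len(sample) == thresh:
            # Sample is full: keep each element with probability 1/2
            # and halve the sampling probability for later elements.
            sample = {x for x in sample if rng.random() < 0.5}
            p /= 2
            if len(sample) == thresh:
                raise RuntimeError("sample still full after halving")

    # Each distinct element survives with probability p, so rescale.
    return len(sample) / p
```

When the number of distinct elements stays below the threshold, $p$ remains 1 and the sketch returns the exact count; for longer streams the result is a randomized estimate whose accuracy depends on `eps` and `delta`.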

Similar Work

Double-hashing Algorithm For Frequency Estimation In Data Streams
High Speed Hashing For Integers And Strings
Description-based Text Similarity
No Repetition Fast Streaming With Highly Concentrated Hashing
