The amount of data stored in data repositories increases every year. This
makes it challenging to link records between different datasets across
companies and even internally, while adhering to privacy regulations. Address
or name changes, and even different spellings used for entity data, can prevent
companies from using private deduplication or record-linking solutions such as
private set intersection (PSI). To this end, we propose a new and efficient
privacy-preserving record linkage (PPRL) protocol that combines PSI and
locality-sensitive hash (LSH) functions, and runs in linear time. We explain the privacy
guarantees that our protocol provides and demonstrate its practicality by
executing the protocol over two datasets with