Incorporating Item Frequency for Differentially Private Set Union

Abstract

We study the problem of releasing the set union of users’ items subject to differential privacy. Previous approaches consider only the set of items for each user as the input. We propose incorporating the item frequency, which is typically available in set union problems, to boost the utility of private mechanisms. However, using the global item frequency over all users would largely increase privacy loss. We propose to use the local item frequency of each user to approximate the global item frequency without incurring additional privacy loss. Local item frequency allows us to design greedy set union mechanisms that are differentially private, which is impossible for previous greedy proposals. Moreover, while all previous works have to use uniform sampling to limit the number of items each user would contribute to, our construction eliminates the sampling step completely and allows our mechanisms to consider all of the users’ items. Finally, we propose to transfer the knowledge of the global item frequency from a public dataset into our mechanism, which further boosts utility even when the public and private datasets are from different domains. We evaluate the proposed methods on multiple real-life datasets.

Publication
AAAI 2022
Lovedeep Gondara
Lovedeep Gondara
Research Scientist

My research interests include machine learning (deep learning) and statistics, with current research focus on large language models, differential privacy, and their applications to healthcare.